Preface



The third International Workshop on Information Security was held at the
University of Wollongong, Australia. The conference was sponsored by the Centre
for Computer Security Research, University of Wollongong. The main themes of
the conference were the newly emerging issues of Information Security.
Multimedia copyright protection and security aspects of e-commerce were two
topics that clearly reflect the focus of the conference. Protection of the
copyright of electronic documents seems to be driven by strong practical demand
from the industry for new, efficient and secure solutions. Although e-commerce
is already booming, it has not reached its full potential in terms of new,
efficient and secure e-commerce protocols with added properties.

There were 63 papers submitted to the conference. The program committee 
accepted 23. Of those accepted, six papers were from Australia, five from Japan, 
two each from Spain, Germany and the USA, and one each from Finland and 
Sweden. Four papers were co-authored by international teams from Canada and 
China, Korea and Australia, Taiwan and Australia, and Belgium, France and 
Germany, respectively. 

Final versions of the accepted papers were gathered using computing and 
other resources of the Institute of Mathematics, Polish Academy of Sciences, 
Warsaw, Poland. We are especially grateful to Jerzy Urbanowicz and Andrzej 
Pokrzywa for their help during preparation of the proceedings. 

We would like to thank the members of the program committee who gave 
generously of their time to read and evaluate papers. We would also like to thank 
members of the organising committee. Finally we thank the authors of all the 
submitted papers, in particular the accepted ones, and all the participants who 
contributed to the success of the conference. 
Abstract. Generally, the low frequency domain may be useful for embedding a
watermark in an image. However, if a watermark is embedded into low
frequency components, blocking effects may occur in the image. Taking
blocking effects into account, we study some characteristics of DCT
coefficients and find some interesting mutual relations. The robustness
of many schemes based on orthogonal transformations such as the DCT
against geometric transformations may be doubtful. For the distortions
produced by some geometric transformations, we propose a searching
protocol to find the watermarked block that has been rotated and
shifted. In the proposed scheme, a watermark remains in the attacked
image with very high probability. Further, the watermark becomes more
robust when error correction is applied.



1 Introduction 

With the spread of the Internet, multimedia has come to deal with digital
contents that can be copied easily without any degradation. This creates a
serious problem for protecting the copyright of digital contents. Watermarking
is one of the effective schemes for protecting that copyright. Watermarking is
a technique for embedding some information in digital contents without it
being perceived. The embedded information can be extracted from the
watermarked content by a tool. A watermark should convey as much information
as possible, and it should be secret, which means that only authorized parties
can access it. Furthermore, it should not be removable from the contents even
if the data are changed by signal processing such as data compression, by
hostile attack, etc. However, if an individual user knows how to embed and
extract, unauthorized parties may forge the embedded information. Therefore,
we establish a trusted center which knows the way of embedding and extracting
in order to maintain the copyright of every digital content. Every author who
wants to obtain the copyright must send his original contents to the trusted
center and get the watermarked contents from it.

A watermark signal is sometimes designed in the spatial domain [1], but most
schemes work in a transformed domain such as the DCT (Discrete Cosine
Transform) domain [2][3][4] or the wavelet domain [5][6], because a signal
embedded in the transformed domain spreads all over the image. In the
transformed domain, a watermark is usually embedded into the high frequency
components, as changes to these components are hard for humans to perceive.
As mentioned in [7], however, changes to the very low frequency components are
as difficult to perceive as changes to the high frequency ones, so the low
frequency components may also be useful for embedding a watermark.
Nevertheless, many authors avoid using the low frequency components of the
DCT, since blocking effects may appear in the image if they are changed.
Blocking effects are usually noticeable because the boundaries of the blocks
appear in the image.

In this paper, we propose a new watermarking scheme and apply the StirMark
attack [8][9] to evaluate it, as this attack is used today for evaluating the
security of many watermarking schemes. Using the additive characteristic of
DCT coefficients, a watermark is embedded into the very low frequency
components of the DCT without causing blocking effects in the image.
Considering the distortions caused by geometric transformations, the distorted
blocks in which the watermark is embedded are located by the searching
protocol, which synchronizes the orthogonal axes and the positions. Further,
by encoding the watermark with an error correcting code, the watermark can
withstand the effects caused by the attacks.

2 StirMark Attack 

A watermark should not be deleted even if many kinds of signal processing,
such as linear or nonlinear filtering, non-invertible compression, addition of
noise, clipping, etc., are performed. Previous works include many watermarking
schemes [10][11][12] that are immune to such attacks. However, specific
distortions such as rotation, extension and reduction have often been
neglected, even though they have the important feature that they cause a
serious degradation in PSNR (Peak Signal to Noise Ratio) while changing the
visual characteristics of the image only a little. Many watermarking schemes
have been evaluated only by their own attacks, and no standard tool for
attacking them had been proposed. The StirMark attack, which performs a number
of default attacks, was then proposed in order to standardize the attacks on
watermarking. In the StirMark attack, an image is rotated, stretched, shifted
and sheared by an unnoticeable amount. Its main purpose is to give a short
overview of the performance of watermarking systems and to provide a standard
tool for comparing them. Of course, some schemes withstanding the StirMark
attack have been proposed, but some problems still remain in them. For
example, the amount of embedded information is very small [13]. Other attacks
were later added in a new version of the StirMark attack published in April
1999. In this paper, we evaluate our scheme using the new tool, version 3.1
[9], which includes low pass filtering, color quantization, JPEG compression,
scaling, clipping, rotation, shearing, horizontal flip, removal of lines and
columns, the FMLR (Frequency Mode Laplacian Removal) attack, etc. [8].

3 Proposed Scheme 

In this section we consider how to embed watermark information and how to
extract it. Taking into account the distortions produced by embedding and by
attacks, we propose a new idea: a watermark information bit embedded in a
block is extracted from its inner sub-block.

3.1 Basic Idea 

Generally, a set of orthogonal basis vectors has some interesting features, so
we analyze the characteristics of the DCT. We begin with the one dimensional
DCT (1D-DCT). Fig. 1(a) shows the four low frequency basis vectors from a_0 to
a_3. When two basis vectors are combined suitably, the amplitude in the
central region increases greatly and the waveform becomes similar to one of
the basis vectors. The waveform of a_0 - a_2 is shown at the top of Fig. 1(b).
The part of the waveform surrounded by a bold line is similar to a half scaled
waveform of a_0. Similarly, the waveform of a_1 - a_3 is similar to that of
a_1.

Fig. 1. Low frequency 1D-DCT basis vectors and their additive performance

This idea can be extended to the two dimensional DCT (2D-DCT). Fig. 2(a) shows
the 16 low frequency basis matrices from A(0, 0) to A(3, 3). When four basis
matrices are combined suitably, a phenomenon similar to the 1D-DCT case
appears. The calculated result of {A(0, 0) - A(0, 2) - A(2, 0) + A(2, 2)} is
shown at the top of Fig. 2(b). The region marked with oblique lines is similar
to a quarter scaled matrix of A(0, 0). Similarly, the region of
{A(0, 1) - A(0, 3) - A(2, 1) + A(2, 3)} is similar to that of A(0, 1), the
region of {A(1, 0) - A(1, 2) - A(3, 0) + A(3, 2)} is similar to that of
A(1, 0), and the region of {A(1, 1) - A(1, 3) - A(3, 1) + A(3, 3)} is similar
to that of A(1, 1). Furthermore, the amplitude outside the marked region
becomes very small.
Fig. 2. Low frequency 2D-DCT basis matrices and their additive performance
((a) basis matrices A(0, 0) to A(3, 3); (b) the four combinations of basis
matrices described in the text)



Applying this feature to our scheme, we embed one watermark information bit
in a 32 x 32 block and extract it from the inner 16 x 16 sub-block located in
the middle of the 32 x 32 block. When watermark information is embedded, the
information bit is added to only four special DCT coefficients in the 32 x 32
block. The energy given to the four coefficients is spread over the block when
IDCT is performed. However, when the inner 16 x 16 sub-block is transformed by
DCT, almost all of the spread energy is concentrated into only one DCT
coefficient. Therefore, the embedded watermark information bit can be
extracted from the sub-block. The four special DCT coefficients for embedding
are specified in the four columns p_1, p_2, p_3 and p_4 of Table 1, and the
coefficient for extracting is the special DCT coefficient given in the column
P, where F_32(*, *) denotes a DCT coefficient of the 32 x 32 block and
F_16(*, *) is defined similarly for the 16 x 16 sub-block.
Table 1. Embedding and extracting coefficients

        embedding                                           extracting
 set    p_1          p_2          p_3          p_4          P
 1      F_32(0, 0)   F_32(0, 2)   F_32(2, 0)   F_32(2, 2)   F_16(0, 0)
 2      F_32(1, 0)   F_32(1, 2)   F_32(3, 0)   F_32(3, 2)   F_16(1, 0)
 3      F_32(0, 1)   F_32(0, 3)   F_32(2, 1)   F_32(2, 3)   F_16(0, 1)
 4      F_32(1, 1)   F_32(1, 3)   F_32(3, 1)   F_32(3, 3)   F_16(1, 1)
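The concentration of energy described in the basic idea can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes an orthonormal DCT-II (the normalisation is not stated in the text), and the helper names `dct_matrix` and `concentration` are hypothetical. For each of the four coefficient sets it places +m, -m, -m, +m on the embedding coefficients of an otherwise empty 32 x 32 block and measures how much of the energy of the inner 16 x 16 sub-block falls on the extracting coefficient.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C (assumed normalisation): F = C @ f @ C.T."""
    k = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * k * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

C32, C16 = dct_matrix(32), dct_matrix(16)

def concentration(u, v, m=1.0):
    """Put +m, -m, -m, +m on F32(u,v), F32(u,v+2), F32(u+2,v), F32(u+2,v+2)
    and return the 16x16 DCT of the inner sub-block of the resulting block."""
    F = np.zeros((32, 32))
    F[u, v] += m
    F[u, v + 2] -= m
    F[u + 2, v] -= m
    F[u + 2, v + 2] += m
    block = C32.T @ F @ C32           # inverse DCT of the 32x32 block
    inner = block[8:24, 8:24]         # sub-block in the middle of the block
    return C16 @ inner @ C16.T

results = {}
for (u, v) in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    F16 = concentration(u, v)
    share = F16[u, v] ** 2 / np.sum(F16 ** 2)   # energy share of F16(u, v)
    results[(u, v)] = (F16[u, v], share)
    print((u, v), "coefficient:", round(F16[u, v], 3), "energy share:", round(share, 3))
```

Under these assumptions the extracting coefficient is positive for all four sets and carries most of the sub-block energy, which is the behaviour the extraction step relies on.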



3.2 Embedding 

Let I be an image, w = (w_0, w_1, ..., w_{n-1}), w_i = 1 or 0, be a watermark
information vector of size n, and m be the embedding intensity. Then the
watermark information w is embedded as follows.

1. I is divided into blocks of 32 x 32 pixels and each block is transformed by
DCT.

2. The embedding coefficients in each block are determined by first selecting,
with a secret key, one of the four sets given as rows in Table 1; the
coefficients to be modified are then the elements of the four columns p_1,
p_2, p_3 and p_4.

3. The element w_t of w is embedded in the t-th block as follows.

If w_t = 0 then

    p_1 = p_1 + m,   p_2 = p_2 - m,   p_3 = p_3 - m,   p_4 = p_4 + m,

else

    p_1 = p_1 - m,   p_2 = p_2 + m,   p_3 = p_3 + m,   p_4 = p_4 - m.

4. Each block is transformed by IDCT, and the watermarked image I', which
consists of the transformed blocks, is obtained.
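The four embedding steps can be sketched in a few lines. This is an illustration under stated assumptions rather than the scheme's reference code: the DCT is taken to be orthonormal, the secret key is modelled by a seeded pseudo-random generator (a hypothetical stand-in, since the key mechanism is not specified), blocks are visited in row-major order, and the result is left as floating point without rounding or clipping to 8-bit pixels.

```python
import numpy as np

# Rows of Table 1: positions of (p1, p2, p3, p4) in the 32x32 DCT block.
SETS = [
    ((0, 0), (0, 2), (2, 0), (2, 2)),
    ((1, 0), (1, 2), (3, 0), (3, 2)),
    ((0, 1), (0, 3), (2, 1), (2, 3)),
    ((1, 1), (1, 3), (3, 1), (3, 3)),
]

def dct_matrix(n):
    """Orthonormal DCT-II matrix C (assumed normalisation): F = C @ f @ C.T."""
    k = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * k * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def embed(image, bits, m, key):
    """Embed one bit per 32x32 block; `key` seeds the (hypothetical) set choice."""
    C = dct_matrix(32)
    rng = np.random.default_rng(key)
    out = np.asarray(image, dtype=float).copy()
    cols = out.shape[1] // 32
    if len(bits) > (out.shape[0] // 32) * cols:
        raise ValueError("more bits than 32x32 blocks")
    for t, w in enumerate(bits):
        r, c = divmod(t, cols)
        rows_sl, cols_sl = slice(32 * r, 32 * r + 32), slice(32 * c, 32 * c + 32)
        F = C @ out[rows_sl, cols_sl] @ C.T          # step 1: block DCT
        p1, p2, p3, p4 = SETS[rng.integers(4)]       # step 2: key-selected set
        s = m if w == 0 else -m                      # step 3: sign depends on w_t
        F[p1] += s
        F[p2] -= s
        F[p3] -= s
        F[p4] += s
        out[rows_sl, cols_sl] = C.T @ F @ C          # step 4: block IDCT
    return out

image = np.zeros((64, 64))
marked = embed(image, [0, 1, 0, 1], m=20, key=7)
print(marked.shape)
```

An extractor would have to regenerate the same sequence of set choices from the key, which is why the generator is seeded rather than left random.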



3.3 Searching Protocol 

A watermark should be extractable from the embedded image even if the
embedded block has been shifted and rotated. To find the amount of shift and
rotation, we calculate the MSE (Mean Square Error) for each shifted and
rotated block and take the position where the MSE is minimum as the most
probable one. If this procedure were performed for every conceivable
distortion caused by attacks such as shift and rotation, the computational
complexity would become extremely high. Therefore, we restrict the domain to
be searched and the candidates for distorted shapes. The block which may be
shifted and rotated is assumed to lie in the searching domain block K of
(16 + d) x (16 + d) shown in Fig. 3, where d is called the searching distance,
B is the block to be embedded and L is the sub-block. The MSE is then
evaluated for each shape T_s (0 ≤ s ≤ 12) given in Fig. 4, where each shape is
obtained by rotating the sub-block slightly. As an illegal image must have
been copied from the watermarked image, it is more efficient to use the
watermarked image than the original one when searching for the shifted and
rotated block. We need not store both images, however: it is enough to
preserve only the original image, because the watermarked image can be
obtained by embedding the watermark using the secret key when the search is
performed.



Fig. 3. Searching domain

Fig. 4. The 13 candidates of rotated sub-block



Let I^* be the image which may have been copied illegally from I'. Each
rotated and shifted block is searched for as follows.

1. First, the following operations are performed.

1.1. The block of the shape T_0 in Fig. 4 is picked out from the upper left
corner of the searching domain block K^*. Then the MSE for the block is
calculated, and the MSE, the shape and its position are preserved.

1.2. Next, the block of the shape T_0 is picked out from the position shifted
one pixel to the right or down, and the MSE is calculated. If the MSE for this
block is less than the preserved one, the preserved MSE, shape and position
are replaced by the new ones.

1.3. The operation 1.2 is performed at all possible positions in the
searching domain block K^*.

2. The operation 1 is repeated for the shapes T_1 to T_12. Here, in the
operation 1.1, the MSE, shape and position are replaced when the MSE is less
than the preserved one.

3. Finally, the position and shape for which the MSE is minimum are selected,
and the rotated and shifted block is reformed to the square shape L^* of
16 x 16 in each block.
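The shift part of the searching protocol (steps 1.1 to 1.3 for a single shape) can be sketched as an exhaustive minimum-MSE search. This is a simplified illustration: it handles only the unrotated shape T_0, because the exact geometry of the rotated candidates T_1 to T_12 is given only in Fig. 4, and the function name `search_shift` and its argument layout are assumptions.

```python
import numpy as np

def search_shift(reference, domain):
    """Return (mse, dy, dx) of the best-matching position of `reference`
    (a 16x16 sub-block of the watermarked image) inside `domain`
    (the (16+d)x(16+d) searching domain block cut from the suspect image)."""
    n = reference.shape[0]
    d = domain.shape[0] - n                  # searching distance
    best = None
    for dy in range(d + 1):                  # shift down one pixel at a time
        for dx in range(d + 1):              # shift right one pixel at a time
            candidate = domain[dy:dy + n, dx:dx + n]
            mse = float(np.mean((candidate - reference) ** 2))
            if best is None or mse < best[0]:
                best = (mse, dy, dx)         # keep the minimum found so far
    return best

# Toy check with d = 6: hide a known 16x16 patch at offset (2, 5).
rng = np.random.default_rng(0)
domain = rng.random((22, 22))
reference = domain[2:18, 5:21].copy()
print(search_shift(reference, domain))
```

Extending the sketch to rotation would mean repeating the same loop for each candidate shape and resampling the winning region to a square 16 x 16 block, as step 3 describes.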
3.4 Extracting

The original image I is necessary to extract the watermark from the image
I^*, because the embedded information is extracted by taking the difference
between the specified DCT coefficient in Table 1 of L^* and that of L. The
procedure is given as follows.

1. The searching protocol is applied to each sub-block and L_t^*
(0 ≤ t ≤ n - 1) is obtained.

2. Each L_t^* and L_t is transformed by DCT.

3. A coefficient in the set of four elements P given in Table 1 is specified
by the secret key.

4. The coefficient of L_t specified above is subtracted from that of L_t^*.

5. If the result is positive, then the extracted information w_t^* = 0;
otherwise w_t^* = 1.
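A round trip for a single block illustrates steps 2 to 5. This is a sketch under assumptions: the DCT is orthonormal, the difference is taken as (suspect coefficient) minus (original coefficient) so that a positive value decodes to 0, consistent with the embedding rule that adds +m to p_1 when w_t = 0, and no attack or search is applied. The helpers `embed_block` and `extract_bit` are hypothetical names.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C (assumed normalisation): F = C @ f @ C.T."""
    k = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * k * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def embed_block(block, bit, m, positions):
    """Embedding rule of Section 3.2 for one 32x32 block (float, no clipping)."""
    C = dct_matrix(32)
    F = C @ block @ C.T
    s = m if bit == 0 else -m
    for pos, sign in zip(positions, (+1, -1, -1, +1)):   # p1, p2, p3, p4
        F[pos] += sign * s
    return C.T @ F @ C

def extract_bit(sub_original, sub_suspect, coeff):
    """Compare the extracting coefficient P of the two 16x16 sub-blocks."""
    C = dct_matrix(16)
    diff = (C @ sub_suspect @ C.T)[coeff] - (C @ sub_original @ C.T)[coeff]
    return 0 if diff > 0 else 1

rng = np.random.default_rng(1)
block = rng.uniform(0, 255, (32, 32))              # stand-in for an image block
set2 = ((1, 0), (1, 2), (3, 0), (3, 2))            # second row of Table 1
recovered = []
for bit in (0, 1):
    marked = embed_block(block, bit, 20, set2)
    recovered.append(extract_bit(block[8:24, 8:24], marked[8:24, 8:24], (1, 0)))
print(recovered)
```

In the full scheme the suspect sub-block would come from the searching protocol rather than directly from the watermarked block.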

3.5 Improvement of Robustness 

Generally a watermark must provide enough robustness whatever attacks are
performed. Our watermark has a strong tolerance for attacks, but it is not
immune to a few attacks such as rotation, shift, etc. One of the reasons is
that the shapes T_s (0 ≤ s ≤ 12) do not cover every distortion caused by
rotation. Of course, it is desirable to consider all possible rotated and
shifted patterns. However, evaluating the MSE with more patterns takes more
computation time and needs more memory in order to estimate all possible
rotated and shifted blocks. To confirm this, we tested other patterns, but the
improvement was only slight in spite of the great increase in computational
complexity. Furthermore, as many kinds of attacks are included in the StirMark
attack, the watermark information bits may be changed by some of them.
Therefore, we must encode the watermark information bits with an error
correcting code in order to improve the robustness against these attacks.
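The benefit of an error correcting code can be estimated with a simple binomial calculation. The sketch below is an illustration only: it assumes that bit errors are independent with a common probability, which the paper does not claim, and that all 64 embeddable bits form one codeword able to correct up to t errors. The function name `p_correct` and the sample error rate are hypothetical.

```python
from math import comb

def p_correct(n_bits, p_bit_error, t):
    """Probability that at most t of n_bits independent bits are in error,
    i.e. that a t-error-correcting code recovers the whole word."""
    return sum(
        comb(n_bits, k) * p_bit_error ** k * (1 - p_bit_error) ** (n_bits - k)
        for k in range(t + 1)
    )

# Hypothetical per-bit error rate of 1% over a 64-bit word.
for t in range(4):
    print(t, round(p_correct(64, 0.01, t), 4))
```

Even a small correcting ability raises the probability of a fully correct word sharply when the per-bit error rate is low, which matches the qualitative trend the simulation tables report.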

4 Consideration 

DCT coefficients have the general property that a change in the low frequency
components causes blocking effects in the image. For example, Fig. 5(a) shows
the blocking effects when the value of F_16(1, 0) is changed, where f(x, y)
denotes the pixel intensity. However, considering the effects caused by
changing several low frequency components, the shape of the blocking effects
can be altered by a careful modification of the special coefficients.

A suitable selection and modification of the coefficients can cause special
effects that are hardly deleted by attacks such as filtering. The selection
means using the four special coefficients in the columns p_1, p_2, p_3 and p_4
of Table 1. The modification means that a certain constant value is added to
the coefficients p_1 and p_4 and subtracted from the coefficients p_2 and p_3.
Then, when the procedure given in the basic idea is applied, the energy of the
four coefficients is concentrated into the specified DCT coefficient P in the
16 x 16 sub-block. For example, if we select set 2 and a certain value is
added to F_32(1, 0) and F_32(3, 2), and subtracted from F_32(1, 2) and
F_32(3, 0), the energy of these coefficients concentrates into the DCT
coefficient F_16(1, 0) in the sub-block. The result in the pixel domain is
shown in Fig. 5(b), and the result in the transformed domain is shown in
Fig. 6.

Hence, the distortions produced by embedding are greatly different from the
blocking effects, and they are less noticeable as no edge is formed. In
addition, the embedded signal is emphasized in the DCT coefficient F_16(1, 0),
and there is no correlation among adjoining 32 x 32 blocks. Furthermore, some
attacks such as extension and reduction do not cause a serious problem for the
watermark, as the embedded signal is also spread outside the sub-block.




Fig. 5. Distortions in a block 




Fig. 6. Concentration of energy into the DCT coefficient F_16(X, Y)

5 Computer Simulated Results

In our simulation, we use the standard image “lenna” and the standard images
“girl”, “baboon”, “couple”, “aerial”, “moon” and “map” in SIDBA (Standard
Image Data Base), each of which is a 256-level gray scale image of size
256 x 256. Since each image contains 8 x 8 = 64 blocks of 32 x 32 pixels, the
maximum number of watermark information bits is 64.

First, we show some results for “lenna”, whose original image is shown in
Fig. 7. Fig. 8 is produced by embedding a watermark into the original image
with embedding intensity m = 30, giving PSNR = 42.5 [dB], and Fig. 9 is an
image produced by applying the well-known StirMark attack to the watermarked
image.

Fig. 7. Original image

The attacked image has no obvious visual difference from the watermarked
image, though its PSNR decreases to 18.4 [dB]. Needless to say, the embedded
watermark can also be extracted from Fig. 9.

Fig. 10 shows the average value of PSNR with respect to the embedding
intensity m. If m is set over 30, the distortions become noticeable in the
watermarked image, as shown in Fig. 11. If m is set under 15, the watermark is
easily deleted by the StirMark attack. Therefore, the suitable range of m is
between 15 and 30 for the image “lenna”. For many images, our simulated
results show that the suitable range of m is also from 15 to 30, though for an
image which includes many edges the distortions in the watermarked image are
masked even if m is set greater than 30. We therefore use the range from 15 to
30 in the following computer simulation.
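The relation between m and PSNR can be approximated in closed form. The sketch below rests on assumptions not stated in the text: an orthonormal DCT, a bit embedded in every 32 x 32 block, a peak value of 255, and no rounding or clipping of pixels. Under them, each block receives energy 4m^2 spread over 1024 pixels, so the MSE is 4m^2 / 1024. The function names are hypothetical.

```python
import math
import numpy as np

def psnr(a, b, peak=255.0):
    """Peak signal to noise ratio in dB between two images of equal shape."""
    mse = np.mean((np.asarray(a, dtype=float) - np.asarray(b, dtype=float)) ** 2)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

def predicted_psnr(m, block=32, peak=255.0):
    """PSNR predicted from the embedding rule alone: four coefficients per
    block change by +/- m, and an orthonormal transform preserves energy."""
    mse = 4.0 * m ** 2 / block ** 2
    return 10.0 * math.log10(peak ** 2 / mse)

for m in (15, 20, 25, 30, 50):
    print(m, round(predicted_psnr(m), 2))
```

For m = 30 this estimate gives about 42.7 dB, close to the 42.5 dB reported for Fig. 8; pixel rounding, which the estimate ignores, would lower it slightly.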

If the StirMark attack is performed with its default parameters, a few errors
may occur in the extracted watermark information; the number of errors may be
less than or equal to 3. In such a case, the watermark information can be
extracted correctly by use of an error correcting code. We therefore perform
the computer simulation under the assumption that the watermark information
has already been encoded by some error correcting code, and hence that the
extracted information is corrected if the number of errors is less than or
equal to the error correcting ability. Fig. 12 shows the probability of
correct extraction for the searching distance d (4 ≤ d ≤ 8) and the embedding
intensity m = 20. The best result is obtained when the searching distance
d = 8. However, it is efficient to use d = 6, as the computational complexity
is about half of that for d = 8. Therefore we set the searching distance
d = 6 in the following simulation.

Fig. 8. Watermarked image (m = 30)

Fig. 9. StirMark attacked image

Fig. 10. PSNR versus embedding intensity m

Fig. 11. Watermarked image (m = 50)

Fig. 12. Probability of correct extraction versus searching distance

Next, we evaluate the tolerance of each watermarked image against the
StirMark attack. Tables 2 to 5 show the computer simulated results of the
correct extraction probability for each error correcting ability, when the
number of simulation runs is 10^. The watermark is generated and embedded
randomly each time. The results mean that the watermark information can be
extracted almost correctly from the image distorted by the StirMark attack
when the embedding intensity is set to m = 30 and a triple error correcting
code is applied. If soft decision decoding is applied, the probability of
correct extraction can be increased. For the images “aerial” and “map”, the
probability of correct extraction is smaller than for the others, because
these two images include a lot of edges that are vulnerable to attacks. For
those two images, the embedding intensity can be set greater than 30, since
the large distortions caused by the embedding will be masked by the many
edges. Therefore it is preferable to set a suitable embedding intensity m for
each image, considering how many edges the image includes.
Table 2. Correct extraction probability [%] for the case of no error correction

 m     lenna   girl    baboon  couple  aerial  moon    map
 15    14.9    19.5    9.0     23.7    0.3     44.5    0.6
 20    49.4    51.3    37.4    67.9    7.7     83.3    4.6
 25    73.7    67.8    63.6    89.5    28.9    95.3    12.0
 30    89.0    78.5    84.0    97.6    58.2    99.2    22.9


Table 3. Correct extraction probability [%] for single error correction

 m     lenna   girl    baboon  couple  aerial  moon    map
 15    44.5    53.7    31.2    57.7    2.0     80.3    4.2
 20    85.2    88.1    74.9    94.4    28.1    98.6    20.3
 25    96.4    95.8    92.6    99.4    65.4    99.9    39.6
 30    99.4    98.3    98.5    100.0   90.0    100.0   59.2



Table 4. Correct extraction probability [%] for double error correction

 m     lenna   girl    baboon  couple  aerial  moon    map
 15    72.4    80.7    57.9    82.3    7.2     94.3    13.8
 20    97.1    98.3    92.8    99.4    54.6    99.9    45.1
 25    99.7    99.7    98.9    100.0   87.8    100.0   68.9
 30    100.0   100.0   99.9    100.0   98.4    100.0   85.0



Table 5. Correct extraction probability [%] for triple error correction

 m     lenna   girl    baboon  couple  aerial  moon    map
 15    89.3    93.9    79.0    94.1    17.5    98.4    30.1
 20    99.6    99.8    98.4    100.0   76.4    100.0   69.5
 25    100.0   100.0   99.9    100.0   96.7    100.0   87.9
 30    100.0   100.0   100.0   100.0   99.8    100.0   96.1



Finally, our scheme can withstand not only the StirMark attack but also the
well-known unZign attack [14], JPEG compression whose quality parameter can
be set under 25 [%], and most image processing operations.



6 Conclusion 

We have proposed a new watermarking scheme that tolerates the StirMark
attack. It has three important points. First, a watermark is embedded into
the very low frequency components of the DCT in such a way that blocking
effects do not appear in the image. Next, the watermark embedded in a block is
extracted from its inner sub-block. Finally, the amount of rotation and shift
can be found by the searching protocol when the watermark is extracted.

In our scheme, the watermark can be extracted correctly with very high
probability although the perceptual distortions caused by embedding are very
small. Our simulation results show that the strength of the watermark depends
on the image, and hence a suitable setting of the parameters is essential.
Abstract. We present a scheme for embedding secret or public readable
watermarks into 3D models consisting of polygonal or NURBS surfaces.
The scheme realizes affine invariant watermarks by displacing vertices
(control points) and satisfies constraints regarding maximum tolerated
vertex movements or, in the NURBS case, differences between original and
watermarked surfaces. The algorithm uses the volume of two tetrahedrons
as an embedding feature. The scheme described can be stacked on a more
robust scheme, allowing transmission of labeling information to the user
or increasing the blind detection capabilities of the underlying scheme.
The paper makes two major contributions, both driven by real world
requirements. The first is a technique to cope with reduced precision of
vertex coordinates. Real world modeling applications represent vertex
coordinates with single floating point precision. Vertex coordinates in
VRML scenes are represented by 6 decimal digits or even fewer. Mesh
compression schemes may quantize vertex coordinates to a precision even
below 4 decimal digits. The second contribution of this paper is a
general technique for reducing the processing time of watermark (label)
extraction, satisfying impatient users, and for enhancing robustness with
respect to affine transformations and, in particular, vertex
randomization attacks. The technique is based on simplifying the mesh by
applying edge collapses prior to watermark embedding and retrieval. The
technique depends on a consistent order of vertices in the embedding and
retrieval processes. We sketch possible extensions of the proposed scheme.

Keywords: Watermarking, 3D Polygonal Model, NURBS Surfaces, Public
readable Watermark, Secret Watermark, Affine Transformation.



1 Introduction 

The possibility of watermarking 3D content, models and virtual scenes, allows
the following applications. Public readable watermarks [1,4] can provide users
of 3D data with information and links about the creator and the license.
Watermarks may contain model classification information in order to support
data retrieval operations. Through embedding of version or history related
information, watermarks can also support the workflow of data. Authenticity
and integrity of content can be proven to the recipients by embedding signed
hash values of the content, together with further information related to the
signer's identity, e.g. certificates or just links for the certificate lookup
[1]. Public watermarks may be designed to be robust and to withstand common
operations performed by users of the content, e.g. placing a model in a
virtual scene and rescaling it to the desired size and shape. Fragile
watermarks lose their content as a result of a predefined set and strength of
alterations. Secret (private) watermarks target the applications of proof of
ownership and tracing of watermarked copies. Robust private watermarking
schemes come with an increased amount of (a priori) information and processing
required for watermark retrieval [8]. Since their main purpose is proving
ownership, these burdens are accepted.

In this paper we describe a watermarking scheme for realizing affine invariant 
secret and public readable watermarks and how they can be applied to polygonal 
3D models and models consisting of tensor product surfaces based on B-splines, 
in particular NURBS surfaces. 



1.1 Related Work 

Ohbuchi et al. [4] published the first algorithm realizing secret or public
readable affine invariant watermarks, named TVR (Tetrahedral Volume Ratio).
The algorithm uses the ratio of the volumes of two tetrahedrons as an affine
invariant embedding primitive. Our algorithm is based on the same affine
invariant feature but, in contrast to [4], exhibits the following properties:

- Applicable to non-2-manifolds.

- Fulfills constraints regarding tolerated vertex displacement.

- Copes with a limited number of mantissa digits in vertex coordinates. [4]
depends on double precision coordinates.

- Does not depend on surface orientations (if, however, reliable surface
normals are available, this information is used for enhancing capacity).

[9] and [1] propose schemes realizing affine invariant watermarks utilizing an
affine invariant norm [3]. Both schemes work for single precision coordinates.
While the first scheme embeds a visual watermark of 1-bit type suitable for
integrity checks (besides the usual secret watermark applications), the second
scheme is capable of embedding local watermarks containing real content
(strings). In comparison to the affine invariant scheme in [1], the scheme
described here establishes explicit limits with respect to numerical stability
(mantissa length) and provides larger capacity.

In the robust watermarking scheme proposed by Praun et al. [8], rigid
transformations and uniform scaling that were applied to a watermarked copy
are reversed in a matching process (model registration) using the original
model. Improving their registration process to handle affine transformations
can be considered a solvable task. In [1] we described a robust scheme called
Normal Bin Encoding (NBE), targeted at blind detection, which exhibits
robustness to non-uniform scaling and general affine transformations of low
"strength" only. As stated before, the purpose of our scheme is to be placed
on top of such a robust scheme, transmitting label information to users, or
providing at least some blind detection capabilities for non-blind detection
schemes. In [6] Ohbuchi et al. presented a general scheme suitable for
embedding secret watermarks into NURBS based curves and surfaces utilizing
reparametrization [7]. Since the scheme does not alter control points, it
preserves the shape exactly, an important property for data sets in the CAD
field. It requires the original model for watermark retrieval. The capacity is
limited, since the number of NURBS surfaces of a model limits the number of
reparametrizations. If weights and/or knot vectors are not alterable, the
scheme is not applicable. In contrast, our scheme aims at secret and public
readable watermarks. Since watermarks are embedded through modification of
control points, the shape is not exactly preserved, but the scheme allows fine
control over vertex displacements. The original is not required in the
retrieval process. The applicability of our scheme depends on the application
of the 3D data: changes of geometry may be tolerable for models in fields such
as character modelling and animation, whereas for certain CAD data sets no
geometry changes may be tolerated.

2 The Embedding Primitive 

We use the ratio $R$ of the volumes of two tetrahedrons $T_1$, $T_2$ as an
embedding primitive. Denote the volumes $V_1$, $V_2$:

$$R = \frac{V_1}{V_2}$$

The vertices of tetrahedron $T_2$ are not changed in the embedding process. The
ratio $R$ is changed by moving a designated vertex $v_0$ of tetrahedron $T_1$,
which is not a vertex of $T_2$, perpendicular to the plane spanned by the other
vertices of $T_1$. Denote the three other vertices $v_1$, $v_2$ and $v_3$. Let
$R$ be represented as a floating point number with the notation ($R > 0$
assumed):

$$R = 0.1\,m_{n-2} \ldots m_0$$

with $m_i \in \{0, 1\}$, $0 \le i \le n-2$, and $n$ being the number of base-2
mantissa digits. The digit with index $n-1$ has the value 1. We embed a
bit-string of length $N$ into a ratio by moving the vertex $v_0$ so that $N$
consecutive ratio mantissa digits match the bit-string. The exact nature of the
embedded information and the choice of mantissa digits used for embedding are
explained in detail in section 3.
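The geometric part of the primitive can be illustrated with a minimal sketch. It assumes unsigned volumes, the ratio $R = V_1 / V_2$, and a movement of $v_0$ along the unit normal of the plane through $v_1, v_2, v_3$; the helper names and the example coordinates are hypothetical and not taken from the paper.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def volume(a, b, c, d):
    """Unsigned volume of the tetrahedron with corners a, b, c, d."""
    return abs(dot(sub(a, d), cross(sub(b, d), sub(c, d)))) / 6.0

def move_vertex_to_ratio(v0, v1, v2, v3, vol2, target_ratio):
    """Shift v0 along the normal of plane (v1, v2, v3) so that V1 / V2 == target_ratio."""
    normal = cross(sub(v2, v1), sub(v3, v1))
    length = math.sqrt(dot(normal, normal))
    unit = tuple(x / length for x in normal)
    base_area = length / 2.0
    height = dot(sub(v0, v1), unit)  # signed height of v0 above the base triangle
    # V1 = base_area * |height| / 3, so the required height follows from the target ratio.
    new_height = math.copysign(3.0 * target_ratio * vol2 / base_area, height)
    shift = new_height - height
    return tuple(p + shift * u for p, u in zip(v0, unit))

# Hypothetical example data: T1 = (v0, v1, v2, v3), T2 is an unrelated fixed tetrahedron.
v1, v2, v3 = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
v0 = (0.2, 0.3, 1.0)
vol2 = volume((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 2.0))
ratio_before = volume(v0, v1, v2, v3) / vol2
v0_new = move_vertex_to_ratio(v0, v1, v2, v3, vol2, 0.75)
ratio_after = volume(v0_new, v1, v2, v3) / vol2
```

Only the height of $v_0$ above the base triangle changes, so the remaining vertices of $T_1$ and all vertices of $T_2$ stay untouched, as required by the text.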

2.1 Grouping of Embedding Primitives 

Let's assume a bit-string $BS_{global}$ of length $L_{global}$ is going to be
embedded into a polygonal model. The algorithm basically operates on triangular
meshes. In our scheme we group several single embedding primitives into units
called Grouped Embedding Primitives (GEP). Each GEP stores information of the
form (index, string), where index gives the position of the bit-string string
in the global bit-string $BS_{global}$. Each GEP consists of two edge-adjacent
start faces $F_1$ and $F_2$ and all vertices adjacent to them. Denote these
vertices $V$. The sets of vertices associated with the GEPs are disjoint.







Fig. 1. A Grouped Embedding Primitive (GEP). 

Figure 1 shows a GEP. The start faces $F_1$ and $F_2$ form the tetrahedron $T_2$
(with the additional edge joining their two unshared vertices) with associated
volume $V_2$. Denote the shared edge $e_s$. Each of the remaining vertices
($v_4$ onward) forms a tetrahedron $T_1$ together with three vertices of faces
$F_1$ and $F_2$. Each of these tetrahedrons, together with $T_2$, forms an
embedding primitive.

We now put these embedding primitives in order: If we can assume a fixed
numbering of vertices (later we will make this assumption for the NURBS case),
we simply sort these primitives based on vertex indices. However, in the usual
case of polygonal models, cut & merge operations in particular may affect vertex
ordering, so we follow an alternative strategy: First we sort vertices $v \in V$
(primitives) based on their general topology with respect to the start faces,
then we sort equally ranked vertices by their number of adjacent points and
faces. For sorting based on topology we assign each vertex $v \in V$ to a
certain set $c_i$ ($0 \le i \le 5$). The set $c_i$ contains all vertices that
are part of faces edge-adjacent to $i$ different edges of the start faces. Each
of these sets can be partitioned further, depending on additional information
available:

1. $F_1$ and $F_2$ can be ordered.

2. Vertices shared by $F_1$ and $F_2$ can be ordered based on surface normals
(orientation).

Please note that we can "provide" this information by performing brute force
testing. For the first item we could simply test both start face combinations
for watermarks.

The upper half of Figure 2 shows possible partitionings of sets $c_0$ and $c_1$.
With no further information available we can distinguish three cases in set
$c_0$ and three in $c_1$. The lower part shows the case for $c_2$, with adjacent
edges highlighted. With no further information we have four cases; if the start
faces are ordered, six cases; and finally, if we can rely on the face normal of
e.g. $F_1$, ten cases. There are meshes for which the second sorting step does
not increase the number of distinguishable cases: Consider a mesh in which every
vertex is regular (has valence six), e.g. inner points of a triangulated
lattice. We find three distinguishable cases with at least one associated vertex
(after both sorting steps). Ordering the start faces results in five cases and
finally, if we can rely on a start-face normal, eight cases. Table 1 summarizes
the distinguishable cases for the sets $c_i$.

Table 1. Distinguishable cases for each set $c_i$. The second column lists the
number of distinguishable cases if no further information is provided, the third
column shows the cases if the start faces are ordered, and the fourth column
shows the cases if, additionally, reliable surface orientation is present.

Set     No further information   Start faces ordered   + Surface orientation
$c_0$   2                        3                     4
$c_1$   2                        3                     5
$c_2$   4                        6                     10
$c_3$   4                        6                     10
$c_4$   2                        3                     5
$c_5$   1                        1                     1

[Figure 2 panels: $c_2$ case; + face ordering; + orientation]

Fig. 2. Ordering of vertices.



3 How Watermark Bits Are Embedded 

Denote the position of the leftmost non-zero mantissa bit $n$. Now we want to
select an embedding range of mantissa digits $m_l, \ldots, m_r$ with
$l - r + 1 = N$ and $0 \le l, r \le n + X - 1$ ($X$ being the maximum number of
digits we may append left of the bit at position $n$) in which we store
watermark related bits; $N$ is the length of the range. The choice of $l$ and
$r$ is restricted by certain constraints. First, the maximum tolerated vertex
movement (specified as Euclidean distance) of vertex $v_0$ gives an upper bound
for $l$, which we denote $l_{max}$. Second, we want the watermark (at least) to
remain stable if the number of mantissa digits in the floating point values
used to represent vertex coordinates is truncated to a certain amount, named
output precision, which gives a lower bound for $r$, which we denote $r_{min}$.
Using a simple approach we could select constant values for $l$ and $r$ which
satisfy the mentioned constraints for "most" embedding primitives encountered
in the model. This approach only succeeds when vertex coordinates are
represented with sufficiently high precision, e.g. using IEEE double precision
floating point values. We therefore developed a strategy that copes with low
vertex coordinate precision by dynamically adjusting a range of constant length
$N$ for each embedding primitive.
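To make the notion of an embedding range concrete, here is a small sketch that overwrites a run of mantissa digits of a positive ratio with a bit pattern and reads it back. It assumes the $0.1\,m_{n-2} \ldots m_0$ mantissa convention from section 2 with $n = 24$ digits (single precision) and ignores the vertex movement needed to realize the new ratio; the function names and the sample values are hypothetical.

```python
import math

def embed_bits(ratio, pattern, l, n=24):
    """Return `ratio` with mantissa digits l .. l-N+1 replaced by `pattern`.

    Digit n-1 is the leading 1 of the mantissa and must not be overwritten.
    Digits below position 0 (beyond n mantissa digits) are truncated.
    """
    N = len(pattern)
    r = l - N + 1
    assert ratio > 0 and 0 <= r and l <= n - 2
    m, e = math.frexp(ratio)              # ratio = m * 2**e with 0.5 <= m < 1
    mantissa = int(m * (1 << n))          # n-digit integer mantissa
    mask = ((1 << N) - 1) << r
    mantissa = (mantissa & ~mask) | (int(pattern, 2) << r)
    return math.ldexp(mantissa, e - n)

def read_bits(ratio, l, N, n=24):
    """Read the N mantissa digits whose leftmost digit has index l."""
    m, _ = math.frexp(ratio)
    mantissa = int(m * (1 << n))
    return format((mantissa >> (l - N + 1)) & ((1 << N) - 1), "0{}b".format(N))

ratio = 0.3712
marked = embed_bits(ratio, "1011010", l=15)
recovered = read_bits(marked, l=15, N=7)
```

The further left the range starts (larger $l$), the larger the change of the ratio and hence of the vertex position, which is why $l$ is bounded from above by the tolerated movement.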

3.1 Calculation of Dynamic Embedding Ranges 

Denote the output precision to which model vertex coordinates will be truncated
after watermark embedding with $OP$. Let's assume a bit pattern
$P = p_0, \ldots, p_{N-1}$ of length $N$ ($p_i \in \{0, 1\}$) is going to be
embedded in a primitive. For reasons explained later, we choose $p_{N-2} = 1$
and $p_{N-1} = 0$. Let $E: n \cdot x - d = 0$ be the plane equation of vertices
$v_1, v_2, v_3$ in Hesse normal form. Denote the maximum tolerated vertex
movement $tol$. $L$ is the largest index of a mantissa digit of ratio $R$ that
changes its value when $n \cdot tol$ is added to $v_0$ (all vertices truncated
to output precision $OP$).

Starting from $l := l_{max} = L - 1$ we set the ratio mantissa bits to the
pattern bits, yielding a new ratio $R'$, and calculate the required height
change $\Delta h$ of tetrahedron $T_1$ for encoding this specific ratio:

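$$\Delta h = \frac{3 V_2 (R' - R)}{G}$$

with $G$ being the area of the base triangle $(v_1, v_2, v_3)$ of $T_1$.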

Since the coordinates of $v_0$ are represented using limited precision, we do
not achieve the desired (exact) height adjustment $\Delta h$; instead we get
$\Delta h + \delta h$, with $\delta h$ being the resulting error. We are
interested in how the output of the "height-calculating" function depends on
its input values. Let's discuss error propagation in more general terms: Think
of the "height-calculating" function as a general vector function

$$\varphi: \mathbb{R}^{d_1} \to \mathbb{R}^{d_2}, \quad y = \varphi(x)$$

Let $\delta x$ denote the error of the input values. Then the error of the
output values $\delta y$ is

$$\delta y = \varphi(x + \delta x) - \varphi(x)$$

If $\varphi$ is twice differentiable, we can estimate $\delta y$ using the
first-order Taylor approximation

$$\delta y \approx D\varphi(x)\,\delta x$$

with $D\varphi(x)$ being the Jacobian matrix of $\varphi$ at $x$.

For the actual problem we have $d_1 = 3$, $d_2 = 1$. Let $x = (x_0\ x_1\ x_2)^T$
be the adjusted $v_0$, and let $e_0, e_1, e_2$ be the exponents of the
coordinate values $x_i$ in base-2 floating point representation.

$$\delta x = \left(2^{e_0 - (OP+1)},\ 2^{e_1 - (OP+1)},\ 2^{e_2 - (OP+1)}\right)^T$$

$$\varphi(x) = n \cdot x - d$$

$$\delta h = n \cdot \delta x$$

We have a stable embedding (for output precision $OP$) of pattern $P$ if

2V2 



with $e_R$ being the exponent of ratio $R$ in base-2 floating point
representation. The last two embedded bits were "10". These guard against
changes of the following mantissa bits and may themselves change due to the
error of the new $v_0$ vector coordinates. We call these bits buffer bits.

We repeat the process with $l$ decreased by one until we get a non-stable
embedding. We set $r_{min}$ to $l - N + 1$ ($l$ being the last left range
boundary which yielded a stable embedding).
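The error estimate used in the stability test can be sketched numerically. The snippet below evaluates the per-coordinate error $2^{e_i - (OP+1)}$ and propagates it through $\varphi(x) = n \cdot x - d$. It assumes the exponent convention of a mantissa in $[0.5, 1)$, which is what Python's `math.frexp` returns, and it takes absolute values of the normal components to obtain a worst-case bound; the latter is an assumption of this sketch and is not stated in the text.

```python
import math

def coordinate_errors(x, op):
    """Per-coordinate error 2**(e_i - (OP + 1)) for output precision `op`."""
    return [2.0 ** (math.frexp(c)[1] - (op + 1)) for c in x]

def height_error(normal, x, op):
    """Worst-case first-order error of phi(x) = n . x - d (d cancels out)."""
    errors = coordinate_errors(x, op)
    return sum(abs(n_i) * dx_i for n_i, dx_i in zip(normal, errors))

# Hypothetical example: unit normal along z, single precision mantissa (OP = 23).
x = (1.0, 0.5, 3.0)
dx = coordinate_errors(x, 23)
dh = height_error((0.0, 0.0, 1.0), x, 23)
```

A candidate range would then be accepted if the change of the ratio caused by `dh` stays below the weight of the buffer bits; that comparison is not included here because the corresponding condition is not legible in the source.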



4 How Watermark Bits Are Retrieved 

Assume that the retriever has no access to the original model at retrieval time 
and does not have any knowledge of the embedded patterns and constraints 
regarding maximum tolerated movement of vertices. Further assume that the 
retriever knows the output precision used in embedding, the length of the
embedded patterns and the number of buffer bits, which were constant throughout
the embedding process. We will now show that this information is sufficient for 
retrieving embedded watermark data in our scheme. 



4.1 Estimation of Embedding Range 

We apply the same test-for-stability procedure as described in section 3.1 for
the calculation of $r_{min}$. Since we don't know the original model and the
embedded patterns, we use the pattern $P = 0 \ldots 0110$ of length $N$ as
"embedding" pattern. Starting with $l$ at the position of the leftmost nonzero
mantissa bit, the test-for-stability procedure yields an estimate of the
$r_{min}$ (not in terms of absolute position) used in the embedding process,
which we denote $r'_{min}$. The value $r_{min}$ is expected to lie in an
interval $[r'_{min} - \Delta r, r'_{min} + \Delta r]$ for a certain value
$\Delta r$. We recover the original $r_{min}$ by means of testing. For this
scheme to work, several embedding patterns need to depend on each other. We
next describe the testing procedure and the process of introducing
dependencies, which we call chaining.

4.2 Chaining of Embedding Patterns

Suppose we ordered $M$ embedding primitives $EP_0, \ldots, EP_{M-1}$ for
embedding $NP$ bit-patterns $P_i$ of length $N+2$ ($i = 0, \ldots, NP-1$). Let
$\tilde{P}_i$ be pattern $P_i$ excluding the constant lower buffer bits (assume
two buffer bits "10", but longer constant "10" sequences may be applied). If we
denote the actual embedded patterns (without constant sequence) as $A_i$, and
a knot-vector $K = (k_0, \ldots, k_{M-1})$, our chaining method can be
described as follows:

$$A_0 = \tilde{P}_{k_0}$$

$$A_1 = MIX(A_0 \oplus \tilde{P}_{k_1})$$

$$\ldots$$

$$A_{M-1} = MIX(A_{M-2} \oplus \tilde{P}_{k_{M-1}})$$

The addition $\oplus$ above is modulo $2^N$. $MIX()$ is a bijection
$\mathbb{Z}_{2^N} \to \mathbb{Z}_{2^N}$ with the following properties: Mapped
values are uncorrelated and the mapping can be made dependent on a key. The
knot-vector $K$ specifies the number of times a pattern is embedded.

Example: $NP = 2$ and $K = (0, 0, 0, 1, 1, 1)$. In this case each of the two
patterns is embedded three times.

Retrieval of patterns from the ordered primitives $EP_0, \ldots, EP_{M-1}$ goes
as follows:

$$B_0 = A_0$$

$$B_1 = MIX^{-1}(A_1) \ominus A_0$$

$$\ldots$$

$$B_{M-1} = MIX^{-1}(A_{M-1}) \ominus A_{M-2}$$

The subtraction $\ominus$ is modulo $2^N$. Using the knot-vector of the example
above, the retrieval algorithm chooses all $A_0$ patterns within a certain
interval around the estimated $r'_{min}$. Next, for each candidate, the pattern
$A_1$ is chosen from a similar interval and $B_0 = B_1$ is tested. The values
$B_0$ of matching candidates are stored and the corresponding $A_1$ values are
used to calculate the value $B_2$ and test for $B_2 = B_0$. Then the process is
repeated for retrieval of the second pattern. Finally, if we found a pattern
$A_5$ which yielded $B_5 = B_3$, we trace back and collect the corresponding
$B_0$ pattern. Then we assign $P_0 := B_0$ and $P_1 := B_3$.
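The chaining and its inversion can be sketched in a few lines. The sketch assumes that patterns are handled as integers modulo $2^N$ and realizes $MIX()$ as a key-seeded random permutation table, which is one possible choice and not necessarily the mixing function used by the authors; all function names are hypothetical.

```python
import random

def make_mix(n_bits, key):
    """Build a keyed bijection on {0, ..., 2**n_bits - 1} and its inverse as tables."""
    size = 1 << n_bits
    table = list(range(size))
    random.Random(key).shuffle(table)
    inverse = [0] * size
    for source, target in enumerate(table):
        inverse[target] = source
    return table, inverse

def chain(patterns, knots, n_bits, key):
    """Compute A_0 = P[k_0] and A_i = MIX(A_{i-1} + P[k_i] mod 2**n_bits)."""
    mix, _ = make_mix(n_bits, key)
    modulus = 1 << n_bits
    embedded = [patterns[knots[0]]]
    for k in knots[1:]:
        embedded.append(mix[(embedded[-1] + patterns[k]) % modulus])
    return embedded

def unchain(embedded, n_bits, key):
    """Compute B_0 = A_0 and B_i = MIX^-1(A_i) - A_{i-1} mod 2**n_bits."""
    _, inverse = make_mix(n_bits, key)
    modulus = 1 << n_bits
    recovered = [embedded[0]]
    for previous, current in zip(embedded, embedded[1:]):
        recovered.append((inverse[current] - previous) % modulus)
    return recovered

# Example from the text: NP = 2 patterns, knot-vector K = (0, 0, 0, 1, 1, 1).
patterns = [0b10110, 0b00111]
knots = [0, 0, 0, 1, 1, 1]
embedded = chain(patterns, knots, n_bits=5, key=1234)
recovered = unchain(embedded, n_bits=5, key=1234)
```

With the correct key, `recovered` repeats each pattern three times, so the equality tests $B_0 = B_1 = B_2$ and $B_3 = B_4 = B_5$ described above succeed.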

The described dependencies among embedding patterns lower the probability of
falsely retrieving patterns for small pattern length $N$. For the realization
of secret watermarks the mixing function $MIX()$ can be made dependent on a
secret key. This can be achieved, for example, by representing the mixing
function as a dictionary and permuting "translated" values depending on the key.

5 Watermarking of NURBS Surfaces

So far we have described how the scheme is applied to polygonal models. Next
we explain how the scheme can be applied to NURBS surface control nets and how
particular constraints regarding surface "differences" can be fulfilled.



5.1 Restricting Control Point Movements 

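Next we recapitulate the definitions of a B-spline basis function and a NURBS
curve. We follow the notation of Piegl and Tiller [7].

$$N_{i,0}(u) = \begin{cases} 1 & \text{if } u_i \le u < u_{i+1} \\ 0 & \text{otherwise} \end{cases}$$

$$N_{i,p}(u) = \frac{u - u_i}{u_{i+p} - u_i} N_{i,p-1}(u) + \frac{u_{i+p+1} - u}{u_{i+p+1} - u_{i+1}} N_{i+1,p-1}(u)$$

$$R_{i,p}(u) = \frac{N_{i,p}(u)\, w_i}{\sum_{j=0}^{n} N_{j,p}(u)\, w_j}$$

$$C(u) = \sum_{i=0}^{n} R_{i,p}(u)\, P_i$$

Assume the polynomial degree of the curve segments is $p$, the number of control
points is $n+1$, and the number of knots is $m+1 = n+p+2$. Denote the control
points with $P_0, \ldots, P_n$ and the knots with $u_0, \ldots, u_m$. Assume all
weights $w_i > 0$ ($0 \le i \le n$).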

We embed information into $C(u)$ by moving control points, which yields the
curve $C'(u)$. Denote the new control points with $P'_0, \ldots, P'_n$. We want
to restrict the changes so that for $u_0 \le u \le u_m$ and a specific distance
$d$ the following is satisfied:

$$|C'(u) - C(u)| < d \qquad (1)$$

Since $\sum_{i=0}^{n} R_{i,p}(u) = 1$ for all $u \in [u_0, u_m]$, and all
$R_{i,p}(u)$ are non-negative for positive weights, (1) can easily be satisfied
by restricting control point movements:

$$|P'_i - P_i| < d$$

for $0 \le i \le n$. This result can easily be extended to the case of tensor
product NURBS surfaces. A NURBS surface of degree $p$ in the $u$ direction and
degree $q$ in the $v$ direction is defined by:

$$S(u,v) = \sum_{i=0}^{n} \sum_{j=0}^{m} \frac{N_{i,p}(u)\, N_{j,q}(v)\, w_{i,j}}{\sum_{k=0}^{n} \sum_{l=0}^{m} N_{k,p}(u)\, N_{l,q}(v)\, w_{k,l}}\, P_{i,j}$$

If the control point movement is limited by $d$, the displacement of the result
of each "inner" curve evaluation is limited by $d$. Constructing and evaluating
a curve from these results, which act as control points with displacement limit
$d$, again yields a result whose displacement is limited by $d$.
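The bound on the curve displacement can be checked numerically. The following sketch evaluates a NURBS curve with the Cox-de Boor recursion, moves every control point by less than $d$, and measures the largest deviation over sampled parameter values. The knot vector, weights, control points and the factor 0.9 are hypothetical example data.

```python
import math
import random

def basis(i, p, u, knots):
    """B-spline basis function N_{i,p}(u) via the Cox-de Boor recursion."""
    if p == 0:
        return 1.0 if knots[i] <= u < knots[i + 1] else 0.0
    left = right = 0.0
    if knots[i + p] != knots[i]:
        left = (u - knots[i]) / (knots[i + p] - knots[i]) * basis(i, p - 1, u, knots)
    if knots[i + p + 1] != knots[i + 1]:
        right = ((knots[i + p + 1] - u) / (knots[i + p + 1] - knots[i + 1])
                 * basis(i + 1, p - 1, u, knots))
    return left + right

def rational_basis(u, p, knots, weights):
    """Rational basis functions R_{i,p}(u) for all control points."""
    terms = [basis(i, p, u, knots) * w for i, w in enumerate(weights)]
    total = sum(terms)
    return [t / total for t in terms]

def curve_point(u, p, knots, weights, points):
    r = rational_basis(u, p, knots, weights)
    return tuple(sum(r_i * pt[c] for r_i, pt in zip(r, points)) for c in range(3))

def perturb(points, d, rng):
    """Move each control point by 0.9 * d in a random direction."""
    moved = []
    for pt in points:
        direction = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(x * x for x in direction))
        moved.append(tuple(c + 0.9 * d * x / norm for c, x in zip(pt, direction)))
    return moved

p = 3
knots = [0.0, 0.0, 0.0, 0.0, 1 / 3, 2 / 3, 1.0, 1.0, 1.0, 1.0]
weights = [1.0, 0.5, 2.0, 1.5, 0.8, 1.0]
points = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0), (2.0, -1.0, 1.0),
          (3.0, 1.0, 2.0), (4.0, 0.0, 1.0), (5.0, 1.0, 0.0)]
d = 0.05
moved = perturb(points, d, random.Random(1))
samples = [k / 50 for k in range(50)]   # parameters in [0, 1)
max_deviation = max(
    math.dist(curve_point(u, p, knots, weights, points),
              curve_point(u, p, knots, weights, moved))
    for u in samples)
```

Because every curve point is a convex combination of the control points, `max_deviation` cannot exceed the largest control point displacement, here `0.9 * d`.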

Moving control points changes the first and second derivatives of a
curve/surface, on which we did not impose restrictions. Furthermore, control
point movement changes the curve/surface properties regarding geometric and
parametric continuity. To minimize these effects we applied the following
simple strategy in the experiments described in section 7: Identical control
points, of one or several NURBS surfaces, were treated as one single point,
while other points were only moved by a maximum of $k$ times ($k < 1/3$) the
(Euclidean) distance to their nearest neighbor.

5.2 Alteration of Control Points 

Assume a NURBS based model consists of several NURBS surfaces and hence control
nets. First we convert each control net (a lattice) to a triangle
representation by dividing each lattice "cell" into two triangles. Next we
merge the resulting triangular meshes by joining identical points and apply our
algorithm. Finally we propagate point movements back to the original control
nets. As an alternative, surfaces can be processed independently. In this case
the algorithm is applied to the inner control points.
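The conversion of a control net into a triangle mesh can be sketched as follows. The sketch assumes a row-major numbering of the lattice points and always splits a cell along the same diagonal; both choices are assumptions, since the text does not fix them.

```python
def lattice_to_triangles(rows, cols):
    """Split every cell of a rows x cols control lattice into two triangles.

    Control point (r, c) has index r * cols + c; triangles are index triples.
    """
    triangles = []
    for r in range(rows - 1):
        for c in range(cols - 1):
            a = r * cols + c        # upper left corner of the cell
            b = a + 1               # upper right
            lower_left = a + cols
            lower_right = lower_left + 1
            triangles.append((a, b, lower_right))
            triangles.append((a, lower_right, lower_left))
    return triangles

single_cell = lattice_to_triangles(2, 2)
net = lattice_to_triangles(3, 4)
```

Since only indices are produced, the vertex displacements computed on the triangle mesh map back to the control net without any further bookkeeping.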



6 Decreasing Processing Time and Enhancing Robustness 

We are interested in speeding up the retrieval process to satisfy impatient
users by reducing the problem (mesh) size. The number of start face
combinations to be tested is $A = 2(v + f - 2)$ for a regular closed mesh, with
$v$, $f$ being the number of the model's vertices and faces respectively. Here
we assume an exhaustive search and a scheme that depends on the order of the
start faces.

We reduce the problem size by applying a mesh reduction step prior to
embedding. The embedding algorithm is applied to a coarse representation of the
mesh in which each vertex relates to the center of mass of a set of vertices in
the original mesh. These sets are disjoint. A further benefit of the proposed
technique is enhanced robustness with respect to point randomization or,
similarly, quantization of vertex coordinates. We identify the following
requirements for the mesh reduction process $P$:

- $P$ "should" not apply any non-affine transformations to the vertex set.
Otherwise information embedded into affine invariant features is going to be
lost.

- Any topology changes applied to the mesh by $P$ must be driven by affine
invariant features. Otherwise the decimation of the watermarked and the
attacked watermarked copy would lead to different topologies. The scheme
proposed in this paper as well as other schemes [1,4,9] rely on (at least
locally) fixed topology.

- All schemes mentioned basically rely on the ratio of volumes of tetrahedrons.
In general, ratios of largely varying volumes and small volumes in flat regions
cause numerical instability of affine invariant feature values. We want the
decimation step not to introduce degenerate faces. Starting from an equally
shaped mesh, the decimated result should be equally shaped too.

The simplification method described below achieves the following properties of
the vertex sets:

- Equal distribution of vertices among sets.

- Vertices of a set are connected and neighboring (not necessarily in the
Euclidean distance sense but in the sense of shortest connecting path length,
assuming all edges have length 1).

- Grouping of vertices into sets does not depend on geometry.

The decimation step furthermore raises the face normal differences of adjacent
faces, hence increasing the volumes in features, which is a desired property.

We decimate the original mesh to a desired number of vertices by performing
a sequence of half edge collapses [2,8]. The edges to be collapsed are taken
from a priority queue in which they are ranked based on the following criteria:
the sum of the numbers of vertices that have been collapsed to the
edge-vertices, the maximum number of vertices collapsed to one of the
edge-vertices, the first edge-vertex id, and the second edge-vertex id (the
first edge-vertex id is always less than the second edge-vertex id). We apply
the embedding algorithm to the decimated mesh and propagate the coarse mesh
vertex displacements back to the vertices in their corresponding sets. For
watermark retrieval, we perform exactly the same reduction step prior to
applying the retrieval algorithm. The proposed technique bears one important
limitation: Due to the priority criterion applied, the proposed method depends
on a fixed ordering of vertices. If a mesh undergoes cut and/or combine
operations after embedding, the ordering is destroyed.
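The ranking criterion can be illustrated with a sketch that works on the edge graph only. It assumes that the four criteria are compared lexicographically, that the vertex with the lower id survives a collapse, and that no topological validity checks are needed; these are simplifications made for this illustration, and the function name is hypothetical.

```python
import heapq

def decimate(num_vertices, edges, target):
    """Collapse edges until `target` vertices remain.

    Returns a dict mapping each surviving vertex to the set of original
    vertices that were collapsed into it (including itself).
    """
    members = {v: {v} for v in range(num_vertices)}
    neighbors = {v: set() for v in range(num_vertices)}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    def key(a, b):
        a, b = min(a, b), max(a, b)
        collapsed_a = len(members[a]) - 1
        collapsed_b = len(members[b]) - 1
        return (collapsed_a + collapsed_b, max(collapsed_a, collapsed_b), a, b)

    heap = [key(a, b) for a, b in edges]
    heapq.heapify(heap)
    while len(members) > target and heap:
        entry = heapq.heappop(heap)
        a, b = entry[2], entry[3]
        if (a not in members or b not in members
                or b not in neighbors[a] or entry != key(a, b)):
            continue  # outdated queue entry
        members[a] |= members.pop(b)          # collapse b into a
        for n in neighbors.pop(b):
            neighbors[n].discard(b)
            if n != a:
                neighbors[n].add(a)
                neighbors[a].add(n)
        for n in neighbors[a]:                # re-rank edges around the survivor
            heapq.heappush(heap, key(a, n))
    return members

# Hypothetical example: a ring of six vertices reduced to three.
ring = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
sets = decimate(6, ring, target=3)
```

Because the key depends only on collapse counts and vertex ids, the grouping is independent of geometry and tends to distribute vertices evenly, but it changes as soon as the vertex numbering changes, which is the limitation noted above.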

7 Practical Experiments 

The scheme described was implemented as part of the GEOMARK system, a
collection of algorithms dedicated to 3D watermarking. The system is
implemented both as a 3D Studio MAX plugin and as a command line version. A
snapshot of the system is given in the second row of Figure 5.

The left image of Figure 3 summarizes the results of the following test case: A
watermark of length 32 bytes was embedded into a polygonal viper car model,
visualized in the third row of Figure 5, consisting of 20670 vertices and 38732
triangular faces, and into a reduced version consisting of 5670 vertices and
10345 faces. We applied our proposed simplification technique for enhancing
robustness and reducing processing time. To clarify: Changes made to a coarse
version of a model are propagated back to the full resolution model. Due to
limited capacity a 14 byte watermark was embedded in another reduced version
consisting of 670 vertices and 825 faces. We used the following embedding
parameters in all three cases: The pattern length was 7 bit, the number of
buffer bits was 8, and the maximum tolerated vertex movement was 0.005 times
the bounding box diameter. Next we applied a point randomization attack by
adding to each vertex a different random vector of length $L_{rnd}$ pointing in
a random direction. In the left image of Figure 3 the fraction of successfully
retrieved watermark values is drawn against the length $L_{rnd}$ of the random
vector, which is expressed in fractions of the bounding box diameter and drawn
as $\log_{10}()$. The three line-strokes drawn represent, from left to right:
original, 5670 vertex and 670 vertex version.
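The point randomization attack used in this test can be sketched as follows. The sketch draws the random direction from a normalized Gaussian vector, which yields uniformly distributed directions; the sampling method, the seed parameter and the example vertices are assumptions made for illustration.

```python
import math
import random

def randomize_points(vertices, length, seed=0):
    """Add a random vector of fixed `length` and random direction to each vertex."""
    rng = random.Random(seed)
    attacked = []
    for vertex in vertices:
        direction = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(x * x for x in direction)) or 1.0
        attacked.append(tuple(c + length * x / norm
                              for c, x in zip(vertex, direction)))
    return attacked

# Hypothetical example: three vertices, random vector length 1e-3.
vertices = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0), (-1.0, 0.5, 2.0)]
attacked = randomize_points(vertices, 1e-3)
displacements = [math.dist(a, b) for a, b in zip(vertices, attacked)]
```

In the experiment the length would be chosen as a fraction of the bounding box diameter of the model.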

Figure 4 visualizes the following test case: A 32 byte watermark was embedded 
into the head NURBS model for output precision of 23 bit. The number of buffer 
bits was 8, maximum tolerated vertex movement was 0.001 times the bounding 
box diameter. The images show percentage of stable embedding primitive values 
after applying affine transformation which is carried out using single precision 
(left image) and double precision (right image). Coordinates were truncated to 
specified mantissa length (x-axis) before and after applying affine transforma- 
tion. Right curves in each figure show results for undecimated (6100 vertices, 
11760 faces), left curves show results for decimated control net (800 vertices, 
1391 faces). 

Table 2. Timings for test cases. First value is processing time required for
simplification, second for embedding, third for retrieving. Times are given in
minutes:seconds.

Model    Vertices, faces   Simplify   Embed   Retrieve
viper    20670, 38732      /          0:08    3:07
viper    5670, 10345       0:01       0:01    0:55
viper    670, 825          0:06       0:01    0:01
head     6100, 11760       /          0:01    1:21
head     800, 1391         0:04       0:01    0:01
helmet   21454, 35136      /          0:02    2:31
helmet   169 surfaces      /          0:31    1:20
helmet   6454, 7807        0:04       0:01    0:08

Table 2 lists processing times of test cases for model simplification,
embedding and retrieval of a watermark. The timings were measured on a 333 MHz
PII running Windows NT 4.0. The C++ code was not optimized and no precautions
were taken to speed up the retrieval process, so the timings are those of a
worst case scenario. Please note that we do not need to test all start face
combinations in the retrieval process. For example, we could sort faces with
respect to topology prior to watermark embedding and retrieval and/or simply
stop retrieval once a certain number of bits has been retrieved. If we embed a
watermark several times, we could stop after the first successful retrieval.

The second line of the helmet NURBS model timings is a special case: Here we
did not merge the NURBS surfaces into one single mesh; instead we processed the
surfaces individually. The helmet model consisted of 169 NURBS surfaces.

[Figure 3 plot; x-axis: random factor]

Fig. 3. Percentage of stable embedding primitive values of a polygonal viper car model
of three simplification levels after applying a point randomization attack.




Fig. 4. A 32 byte watermark was embedded into head NURBS model for output preci- 
sion of 23 bit. The images show percentage of stable embedding primitive values after 
applying affine transformation which is carried out using single precision (left image) 
and double precision (right image). Right curves in each figure show results for full 
resolution, left curves show results for simplified control net. 

8 Conclusion and Future Extensions 

Real world requirements make great demands on watermarking schemes. Even the
realization of semi-robust schemes suitable for transmission of labeling
information is a non-trivial task due to the constraints regarding vertex
coordinate precision and the requirements regarding robustness to frequently
performed operations, e.g. affine transformations. We proposed a watermarking
scheme realizing robustness with respect to affine transformations for
polygonal and NURBS surface based models. The scheme copes with the reduced
vertex coordinate precision encountered in real world applications. However,
the watermarks are not robust against connectivity-altering (remeshing)
operations. Realizing robust schemes providing enough capacity and of course
blind detection capabilities for labelling applications is an open research
issue. Taking into account that we have to deal with single or double floating
point precision in most cases and that the number of pattern and buffer bits
(or the possible range of numbers) is hardwired in the decoder, the scheme
allows for blind detection of watermarks. We are currently investigating
extensions to make the patterns of an embedding primitive even "more dependent"
on each other, e.g. by adding the start position in the mantissa of the ratio
to the parameters for generation of the next pattern. This should enable us to
lower the pattern size and hence increase both capacity and robustness with
respect to quantization of coordinates.

The mesh simplification technique proposed is in fact affine invariant,
successfully reduces computational time and enhances robustness, but requires a
fixed vertex ordering, which is an important limitation. Our current work
therefore covers an extension in which vertices are ordered locally, starting
from a designated edge. Fast identification of this edge is achieved through
embedding a trace watermark (no real content, just a trace) in the
edge-embedding primitive using the fast, robust, low capacity scheme described
in [1]. After ordering all vertices in an n-ring neighborhood of the start
edge, the simplification step is applied.

Fig. 5. First row: Left head is the original NURBS head model, right head
contains a 32 byte watermark. Second row: Left image shows the original helmet
NURBS model to the left; the right helmet contains a 40 byte watermark (a
bitmap of size 20x16). The right image is a snapshot of the GEOMARK plugin
showing the retrieved bitmap. Third row, from left to right: Original
triangular viper model, watermarked copy containing a 32 byte watermark, model
after applying the affine transformation used in experiments.

Acknowledgements. The head and helmet NURBS models are courtesy of Paraform
Inc. (www.paraform.com); the viper car model originates from www.3dcafe.com.

Abstract. Steganography is a method of communication which hides the existence
of the communication from a third party. The employment of steganographic
techniques depends on various demands, which will be derived and considered.
This paper regards different aspects of digital images: their representation,
their storage and, as a result, their suitability for use in steganography.
Based upon this, some criteria for the choice of suitable cover images are
pointed out. A new technical steganographic system, which uses reference
colours as the starting point for embedding and extraction of information, has
been developed and will be described. This system can be adapted flexibly to
different requirements and considers the human visual system as well as the
derived statements for potential cover images. Results are described which
justify the new approach.



1 Introduction 

There has been significant interest in steganography in recent years, for
different reasons. It is generally known that the security of a communication
can be increased through cryptographic encoding. However, the existence of an
encrypted communication is visible to a third party, which may be a
disadvantage. By using steganographic methods, additional information can be
hidden imperceptibly in insignificant objects. Such a communication is not
recognisable and the communication partners will not be compromised.

Images, as representatives of visual information, have always played a great
role in people's lives. That is why they can be used as cover objects for
concealed communication. For steganography in digital images, different aspects
and demands must be considered. These include the properties of human visual
perception, suitable cover images, user preferences and the characteristics of
the steganographic process used. If there are changes at runtime, many
proposed methods cannot be adapted and use the same fixed procedure for each
image. That is why many of the resulting stego images show visible distortions
which indicate embedded additional information. In the presented work we show
that an adaptation of the embedding procedure at runtime is possible. The work
is organised as follows. Section 2 introduces demands and their priorities in
steganographic systems, derived from the demands on common information hiding
techniques. Section 3 briefly introduces various methods for embedding
information in images and classifies current tools and techniques. Furthermore,
we discuss a property of the HVS and its exploitation. Section 4 introduces a
new possibility to exploit the HVS for information hiding through the use of
the discussed effects in the visual perception process. Because in
steganography cover images can be chosen freely, section 5 gives different
rules for the selection of suitable covers. Section 6 shows a new approach of
an adaptive steganographic framework, which is introduced by an example
implementation.




Fig. 1. Demands on information hiding techniques [Ros00].



2 Demands to Steganographic Systems 

There are different demands on information hiding (Fig. 1). These are mutually
contrary and cannot be maximised simultaneously. Thus, they have to be
prioritised according to the information-hiding problem at hand.

In steganography we have the following prioritisation of global requirements:
imperceptible embedding is the main requirement and must be given the highest
priority, because the existence of the communication is to be concealed and
known only to dedicated participants. Otherwise the system is useless. This is
followed by the capacity at medium priority. A steganographic technique is
regarded as a method of communication. Hence, the flow of information in the
communication benefits considerably from a large amount of transferable
information. The robustness of the system against manipulations of the picture
has the lowest priority and is often neglected [Mac97,Uph]. However, if the
communication channel is disturbed (e.g. by changing the image format), the
hidden information should be preserved in order not to cut off the
communication.

Independently of this global prioritisation, it is possible that in a
steganographic system further conditions (e.g., user preferences) must be
considered without changing the global prioritisation.



3 Related Work 

Information can be embedded into images in many different ways. Fundamentally,
one can classify technical steganographic systems as belonging to one of the
following three groups:

Image domain techniques hide the information by direct manipulations of the
image domain. This class includes methods which use the least significant bits
of each pixel for embedding [Bro97,Rep00,Mac97]. Generally, the image domain
tools can reach a very high capacity, but offer no robustness against even the
smallest modifications of the stego image.
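As an illustration of the image domain class, the following sketch shows plain least significant bit embedding on a list of 8-bit grey values. It is a generic textbook variant and is not taken from any of the cited tools; the function names and pixel values are hypothetical.

```python
def embed_lsb(pixels, message_bits):
    """Replace the least significant bit of the first pixels with the message bits."""
    if len(message_bits) > len(pixels):
        raise ValueError("message does not fit into the cover image")
    stego = list(pixels)
    for index, bit in enumerate(message_bits):
        stego[index] = (stego[index] & ~1) | bit
    return stego

def extract_lsb(pixels, count):
    """Read `count` message bits from the least significant bits."""
    return [value & 1 for value in pixels[:count]]

cover = [200, 13, 77, 54, 91]
bits = [1, 0, 0, 1]
stego = embed_lsb(cover, bits)
extracted = extract_lsb(stego, len(bits))
```

Each pixel changes by at most one grey level, which explains the high capacity, while any later modification of the pixel values can destroy the embedded bits, which explains the missing robustness.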

Transformation domain techniques are processes which manipulate algorithms or 
embed the information in the frequency domain [Uph,SZA96]. Techniques which in- 
volve a colour transformation to change certain image attributes (e.g., luminance values 
[SZT96]), can be considered to belonging to this group, too. These techniques are more 
robust than image domain tools and deliver a good image quality. Their large disadvan- 
tage is the low capacity, however. 

Hybrid techniques possess certain properties of both groups. Hybrid systems offer a 
good visual quality and are more robust against manipulations. The capacity is similar 
to transformation domain tools [JJ98,CKLS96]. 

However, none of the presented systems is able to react to changes of user preferences.
There is no possibility to weight the demands by slightly increasing or decreasing any
desired priority depending on practical necessities. Only one system [Rep00] provides
the facility to change the number of bits which can be embedded per pixel.

All of the presented techniques consider the characteristics of the human visual
system (HVS) while embedding. Many image domain tools take advantage of the effect
of the Just Noticeable Differences (jnd) to conceal slight changes of colour information,
but there is no system which considers the different types of images. However, the jnd
is not constant [Say96]: it depends on the number of colour components used in the
image (average for grey scale images: 100 jnd) and on the image information
(noisy/uniform areas: 30/240 jnd).



4 Perception of Luminance and Its Use for Steganography 



In section 3 it was described how the jnd can be exploited in steganography. This section 
introduces a new HVS-effect. 

Colours can be described by the three parameters luminance, hue and saturation. 
According to [Mal98], changes in luminance are perceived most strongly.

Colour-only embedding: If changes in luminance are more perceptible than changes
in chrominance information, the embedding function must preserve the luminance
values of the picture and change only the hue and saturation information.

If changing the colour information does not provide enough capacity, it may be
necessary to change the luminance, too. In this case, a non-linear embedding strategy
can lead to an improvement of invisible embedding as follows. In Fig. 2a it can be
seen that the physically absorbed luminance does not depend linearly on the perceived
luminance. Additionally, changes cannot be perceived below a threshold. If one embeds
the additional information in physical luminance values which are smaller than the visual
threshold, changes are not visible. In Fig. 2b another relationship is shown. Here, the
value of 1% is the threshold where changes become visible. Thus, for luminance values
below 100, changes of one will result in visible distortions, whereas at values above 100
changes of the same magnitude cannot be perceived [Pol99]. The higher the luminance
values, the larger the possible changes before the modification becomes perceivable.

Fig. 2. Difference between physically absorbed and perceived luminance.
(a) Relationship between physically absorbed and perceived luminance [FM97].
(b) The different weighting of the perception of uniform luminance spacings.
Weightings over 1% can be perceived.



Non-linear luminance embedding: Luminance values which are very large (> 100)
and values which lie below the perception threshold should be modified during
embedding.
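The 1% rule described above can be turned into a small selection test. The following
is only a sketch under stated assumptions: luminance is taken to be an integer value,
a change counts as invisible only if it stays strictly below 1% of the current value,
and the function names are hypothetical rather than part of the described system.

```python
def max_invisible_change(luminance: int, percent: int = 1) -> int:
    """Largest integer change that stays strictly below `percent` % of the luminance.

    Assumption: a change d is invisible iff d * 100 < luminance * percent.
    """
    return max((luminance * percent - 1) // 100, 0)


def suitable_for_embedding(luminance: int) -> bool:
    """A pixel is usable if a change of at least one stays invisible."""
    return max_invisible_change(luminance) >= 1


if __name__ == "__main__":
    for value in (50, 100, 101, 255):
        print(value, max_invisible_change(value), suitable_for_embedding(value))
```

Under these assumptions only luminance values above 100 admit a change of one, which
agrees with the rule stated above; integer arithmetic is used to avoid rounding effects
exactly at the threshold.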

5 Potential Cover Images 

The advantage of steganography is the free choice of a suitable cover image for the
embedding. Because of the multitude of possibilities which steganography in digital
images offers, it appears difficult to give a universally valid statement about potential
covers. In this section we will discuss which properties of a cover influence the suitability
with respect to the main requirements of imperceptible embedding and high capacity.

The entropy is often considered as the main criterion for selecting a suitable cover
and is also used in this proposal. If a cover and the additional information emb
are independent and stego is the resulting image from the embedding process, we can
conclude that:

$$ H(\mathit{stego}) = H(\mathit{cover}) + H(\mathit{emb}) \tag{1} $$

Based upon this, there are two alternatives to conceal the presence of the embedded 
information from a possible attacker [AP98]. 

1. The entropy of the embedded information H(emb) must be smaller than the
uncertainty ε about the entropy H(cover) of the cover, as estimated by the
attacker.

2. The entropy of the cover must be reduced by some suitable means which maintain
its visual appearance.

A potential problem with this approach is that the person who selects the cover image
does not know how well an attacker can estimate H(cover) by examining the stego
image.
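As an illustration of the quantity H used above, the entropy of an image can be
estimated from the relative frequencies of its colours. This is a minimal sketch,
assuming the Shannon entropy of the empirical colour distribution is meant; the text
does not state which estimator is used, and the function name is hypothetical.

```python
import math
from collections import Counter


def colour_entropy(pixels) -> float:
    """Shannon entropy (in bits per pixel) of the empirical colour distribution."""
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    uniform_area = [(10, 10, 10)] * 8
    noisy_area = [(i, 2 * i, 3 * i) for i in range(8)]
    print(colour_entropy(uniform_area))  # a single colour carries no uncertainty
    print(colour_entropy(noisy_area))    # eight equally frequent colours
```

An image consisting of a single colour has entropy zero, while many equally frequent
colours maximise it.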



5.1 Potential Covers for Imperceptible Embedding 

For the selection of a cover image suitable for imperceptible embedding, we have to 
consider the image content from the viewpoint of HVS properties. 

In section 3 we have pointed out that the jnd depends on the image content: it is
small in uniform regions and large in noisy areas. Thus, a noisy image is better suited
for embedding than a picture with many uniform areas, because modifications of the
cover are less visible.

Because noisy images in general include a large number of colours and the entropy
highly correlates with this value, it follows: for invisible embedding a high entropy of
the cover must be demanded.

It was mentioned that the entropy of the message should be smaller than the
uncertainty ε about the entropy of the cover. This uncertainty is difficult to model
universally and is based on the possibilities and the general understanding of the
attacker for the analysed image. However, depending on the cover one could try to
estimate how large an ε can be chosen. As proposed in the previous section we can also
decrease the entropy of the cover to cover_Red, so it follows:

$$ H(\mathit{cover}) - \varepsilon < H(\mathit{cover}_{\mathit{Red}}) + H(\mathit{emb}) < H(\mathit{cover}) + \varepsilon \tag{2} $$

For steganography these limits for the estimation of H(cover) are of special interest.
If information is embedded with an entropy larger than ε, this results in perceptible
embedding and a violation of the main requirement.

Furthermore, from Eq. 2 it can be concluded that the entropy H(emb) depends mainly
on ε and therefore on the uncertainty of the entropy¹ of the cover. Thus a cover should
be chosen whose value of ε can be assumed to be as large as possible, i.e. for which it
is hard to estimate the entropy of the cover.
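Rearranging Eq. 2 for H(emb) makes the admissible range explicit. The numbers in the
worked example are hypothetical and serve only to illustrate the inequality; they are
not measurements from this work.

```latex
% Eq. 2 solved for H(emb)
H(\mathit{cover}) - H(\mathit{cover}_{\mathit{Red}}) - \varepsilon
  \;<\; H(\mathit{emb}) \;<\;
H(\mathit{cover}) - H(\mathit{cover}_{\mathit{Red}}) + \varepsilon

% Hypothetical values: H(cover) = 7.0, H(cover_Red) = 6.0, \varepsilon = 0.5 (bits per pixel)
7.0 - 6.0 - 0.5 = 0.5 \;<\; H(\mathit{emb}) \;<\; 7.0 - 6.0 + 0.5 = 1.5
```

Without any reduction (cover_Red = cover) the upper bound collapses to H(emb) < ε,
which is the first alternative listed in the previous section.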



5.2 Potential Covers for Increased Capacity 

The channel capacity n defines the available bandwidth for transferring the image
information [Ros00] via an n-bit channel. The value n is given in bits per pixel (bpp)
and determines the precision of a quantised analogue signal. The entropy of images is
usually smaller than the channel capacity. Therefore the following inequality can be
derived [PAK99]:

$$ H(\mathit{emb}) < n - H(\mathit{cover}) \tag{3} $$

Therefore, the possibility to embed a message with a high entropy can be achieved by
two measures:

1. The entropy of the cover image H(cover) is reduced suitably.

2. The width n of the communication channel is increased.

¹ A practical expression for ε could be the value of the natural noise in the cover.

Both points can be influenced by the choice of a suitable cover image. 
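The effect of the two measures on the bound of Eq. 3 can be sketched numerically.
The helper name and the entropy values below are assumptions made purely for
illustration.

```python
def embedding_headroom(channel_bits: float, cover_entropy: float) -> float:
    """Upper bound n - H(cover) on H(emb) from Eq. 3, in bits per pixel."""
    if cover_entropy > channel_bits:
        raise ValueError("the entropy of an image cannot exceed the channel capacity")
    return channel_bits - cover_entropy


# Hypothetical entropies, chosen only for illustration.
original = embedding_headroom(8, 7.5)   # 8-bit image with a high-entropy cover
reduced = embedding_headroom(8, 6.0)    # measure 1: entropy of the cover reduced
wider = embedding_headroom(24, 7.5)     # measure 2: wider channel, same cover

if __name__ == "__main__":
    print(original, reduced, wider)
```

Both measures enlarge the bound; which one is practical depends on the cover and on
the image format, as discussed in the remainder of this section.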

By the free choice of cover images it is possible to use covers which present the
natural image information with a low entropy and which possess a suitable entropy in
the sense of Eq. 3. Furthermore, if it is possible to decrease the number of different
colours in the picture², the entropy decreases and an arbitrary cover can be used. From
the point of view of a high capacity, a cover image with a low entropy has to be selected
or the entropy of the cover must be reduced.

In digital images the channel capacity n is determined by the choice of the image
format used. With an increase of the channel capacity, the entropy H(cover) and
its estimate H(cover) ± ε can in general vary more. Through the possibility of reducing
H(cover), the significance of n is of course weakened. For large ε it nevertheless plays
a decisive role, especially for scenarios with n < H(cover) + ε, where the
capacity then depends mainly on n. For covers with n ≥ H(cover) + ε, the reduction of
the entropy can be skipped. In that case the size of n has no influence on the capacity of
the process. Existing image formats with a large channel capacity (> 24) are therefore
especially well suited for steganography. Because the available channel capacity of the
image formats is generally used for the exact storage of image information³, the number
of colours in such images approximates n in practice. An estimate for the entropy and
the resulting ε is therefore smaller in the GIF image format with n = 8 than in the BMP
image format with n = 24.



5.3 Other Considerations 

Besides the discussed requirements for potential covers with respect to the
imperceptibility of embedding and a high capacity, further points have to be considered.
Here, we just mention some of them briefly. For image formats these are, besides the
channel capacity, the colour systems used and the lossy/lossless compression and storage
of the image information.

The dependence between the cover image and the steganographic technique used is also
important. Each process has its own method to embed the data in the image. Accordingly,
it must be decided from case to case which image is suitable for this
technique.

An often neglected question is the transfer of the stego image from the sender over
a public channel to the receiver. Because the basic idea of steganography is hidden
communication, the stego image must not attract attention. It should conform to
a certain context and not appear out of place. Therefore the selection of the cover⁴
according to its semantic image content is as important as the preceding considerations.



² Such a reduction of entropy must consider the HVS.
³ E.g. during a scan process.
⁴ The free choice of a cover here becomes limited.







R. Rosenbaum and H. Schumann 



6 An Adaptive Steganographic Framework 



In this section, we introduce a new hybrid steganographic approach. The presented
framework is distinguished from previous approaches by the embedding strategy and a
greater flexibility in the face of changing demands. This flexibility can be reached on
two different levels: (1) on the level of the systems and algorithms used, and (2) on the
level of a concrete implementation. The first point can be realised during the design
phase and implementation of the steganographic system by exchanging modules. On the
level of a concrete implementation, changes are possible during the execution time of
the system.




Fig. 3. Architecture of the proposed framework (legend: embedding; user interaction).



Many steganographic systems embed the additional information
in specific bit positions (e.g. the LSB of a pixel) of the image. The receiver knows these
bit positions and can extract the message. The basic idea of the new approach is to
use so-called reference colours instead of bit positions, which form the starting point
for the embedding and extraction. These reference colours are chosen such that if the
cover image is described only by these colours, there are no visual differences to the
original cover image. Starting from an image that is normalised in this way, parts of
information can be embedded into it by small changes of the reference colours. Here the
discussed effect of just noticeable differences is used. The extraction itself is based on
the comparison between the reference colours and the colour information of the stego
image. Therefore, sender and receiver must share the same reference colour information
to extract the information correctly. In the image domain the unequivocal assignment and
decision of membership for pixel values to parts of the additional information is based
on a pseudo-noise function⁵. The cover used, the reference colours of the cover and
the parameters which steer the random function can be suitably chosen depending on
the existing requirements.

⁵ Implementation of Kerckhoffs' principle [Sim83].

The presented framework consists of modules which can be chosen according to
the suitability of components for embedding and extracting (Fig. 3). According to the
properties of the individual requirements (cf. section 2) a suitable candidate can be
selected for each codec step. To embed a message, we first have to choose a suitable cover
(module 1) which depends on the user requirements and is based on the propositions
given in section 5. After that, we transform this cover into a more suitable colour system
(module 2) and select the reference colours (module 3). Starting from these colours, the
additional information is embedded (module 4), and the colour system of the encoded
image is transformed inversely (module 2). The extraction proceeds similarly to
embedding. First the same colour transformation as during embedding is applied to the
stego image (module 2) and the same reference colours must be extracted (module 3).
Then, the message can be extracted (module 4).
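The module sequence described above can be sketched as a small container of
interchangeable functions. This is not the implementation of the framework: the class
name, the one-value-per-pixel image model and the three stand-in modules are
hypothetical and were chosen only so that the sketch runs.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Image = List[int]  # toy model: one integer colour value per pixel


@dataclass
class StegoFramework:
    """Modules 2 to 4; module 1 (cover selection) is left to the caller."""
    to_working: Callable[[Image], Image]                 # module 2, forward transform
    from_working: Callable[[Image], Image]               # module 2, inverse transform
    find_references: Callable[[Image], Image]            # module 3
    embed_bits: Callable[[Image, Sequence[int]], Image]  # module 4, embedding
    extract_bits: Callable[[Image, Image], List[int]]    # module 4, extraction

    def embed(self, cover: Image, bits: Sequence[int]) -> Image:
        working = self.to_working(cover)
        references = self.find_references(working)  # normalised image
        return self.from_working(self.embed_bits(references, bits))

    def extract(self, stego: Image) -> List[int]:
        working = self.to_working(stego)
        references = self.find_references(working)
        return self.extract_bits(working, references)


# Hypothetical stand-in modules.
def centre_of_bin(image: Image) -> Image:
    """Reference value = centre of the width-4 bin containing the pixel value."""
    return [4 * (p // 4) + 2 for p in image]


def plus_minus_one(references: Image, bits: Sequence[int]) -> Image:
    """Move the i-th reference value up (bit 1) or down (bit 0) by one."""
    return [r + (1 if bits[i] else -1) if i < len(bits) else r
            for i, r in enumerate(references)]


def compare(working: Image, references: Image) -> List[int]:
    """Read a bit from every pixel that deviates from its reference value."""
    return [1 if p > r else 0 for p, r in zip(working, references) if p != r]


toy = StegoFramework(list, list, centre_of_bin, plus_minus_one, compare)

if __name__ == "__main__":
    stego = toy.embed([10, 21, 35, 47], [1, 0, 1])
    print(stego, toy.extract(stego))
```

Exchanging a module then amounts to passing a different function, for example a real
colour transformation for module 2.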

In the following example, a concrete implementation of the framework is shown which
considers main and ancillary requirements and presents suitable modules.

6.1 Example

Main requirements: Special attention is paid to the enforcement of invisible embedding
as the basic requirement. The capacity is prioritised lower and the robustness is
considered only in scenarios which are defined in the ancillary requirements.

Ancillary requirements: far-reaching independence from the cover image formats
and exchanges between the formats.

Cover image selection. Based on the main requirement for imperceptibility of the
message and the derived statements for potential covers, a suitable image has to be
chosen. This can be done by searching in an image database with the given criteria. The
candidates found can be shown to the user who selects the image to be used as a cover.



The transformation into a suitable colour system. The YUV colour system was chosen
here as the colour system for the embedding. Compared with other colour systems such
as RGB, the YUV colour system has some decisive advantages. The three channels have a
different perceptual weighting and allow an embedding adjusted to the human visual
perception. With a linear system of equations, the colour information can easily be
transformed between the YUV and the RGB colour system⁶.
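The linear transformation mentioned above can be sketched as follows. The
coefficients are the commonly used ITU-R BT.601 luma weights with the classical U/V
scaling factors; that the example implementation uses exactly these values is an
assumption, since the text does not list them.

```python
def rgb_to_yuv(r: float, g: float, b: float):
    """RGB -> YUV with BT.601 luma weights (assumed, not given in the text)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v


def yuv_to_rgb(y: float, u: float, v: float):
    """Inverse of rgb_to_yuv, obtained by solving the same linear equations."""
    r = y + v / 0.877
    b = y + u / 0.492
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b


if __name__ == "__main__":
    yuv = rgb_to_yuv(200, 100, 50)
    print(yuv, yuv_to_rgb(*yuv))
```

With floating-point values the round trip is exact up to rounding errors. If the
channels are rounded to integers, the transformation is in general no longer exactly
invertible, which an implementation that changes chrominance values by one has to take
into account.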



Finding the reference colours. Because it must be possible to code into any given
image, the reference colours must be generated dynamically for each individual image.
To additionally increase the capacity of the process, the entropy of a chosen cover image
may be suitably reduced before the embedding process starts. This is achieved here
by reducing the number of colours in the image. To this end, we exploit
colour quantisation.

⁶ The RGB colour system is widely used and is assumed here to be the colour system of
the cover.

The problem of all quantisation algorithms is the quantisation error Err_quant
emerging in the picture if l colours c_i (1 ≤ i ≤ l), which appear d(c_i) times
respectively, are reduced to k colours q_j (1 ≤ j ≤ k, k < l). This can be accomplished
by partitioning the colour space into k subsets C_j, whereby for each subset a
representative colour q_j is chosen. The resulting magnitude of this error greatly
influences the quality of the stego image:

$$ \mathit{Err}_{quant} = \sum_{j=1}^{k} \sum_{i=1}^{l} \begin{cases} d(c_i)\,(c_i - q_j)^2 & \text{if } c_i \in C_j \\ 0 & \text{else} \end{cases} \tag{4} $$

As distance measure between c_i and q_j the mean square error is used.

The colours q_j generated by the colour quantisation of the cover image are used
as reference colours for embedding emb. These are generally all colours which appear
in the image directly after the quantisation. The number k of colours stands
in direct relationship to the parameter ε introduced in Eq. 2 as the value of uncertainty
about the entropy estimation. It holds: the smaller ε, the smaller the difference l - k
must be chosen.
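The quantisation error can be computed directly from the colour histogram. The sketch
below assumes that colours are tuples of integers and that the partition of the colour
space is given as a function mapping each colour to its representative; the uniform
quantiser is a hypothetical stand-in, not the quantisation algorithm used in the
example implementation.

```python
from collections import Counter


def quantisation_error(pixels, representative) -> int:
    """Sum over all distinct colours c of d(c) * |c - q(c)|^2.

    pixels:         list of colour tuples
    representative: function mapping a colour to its reference colour q
    """
    occurrences = Counter(pixels)  # d(c_i) for every distinct colour c_i
    return sum(
        d * sum((a - b) ** 2 for a, b in zip(c, representative(c)))
        for c, d in occurrences.items()
    )


def uniform_quantiser(colour, width: int = 8):
    """Hypothetical quantiser: centre of the width-8 bin in every channel."""
    return tuple((v // width) * width + width // 2 for v in colour)


if __name__ == "__main__":
    image = [(10, 20, 30)] * 3 + [(12, 20, 28)]
    print(quantisation_error(image, uniform_quantiser))
```

Summing only over the representative of each colour is equivalent to a double sum in
which all terms with colours outside the subset of q_j are zero.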



The embedding of additional information. At this point the picture still consists only 
of reference colours, which are now the starting point for embedding. Because we want 
to exploit the visually low prioritised chrominance channels, the YUV-colour system is 
used. Here, the embedding algorithm considers only one channel and works as follows: 

1. Consider the chrominance value of a given pixel.

2. Read a bit from emb. 

3. To embed a “0”, decrease the chrominance value of the pixel by one. 

4. To embed a “1”, increase the chrominance value of the pixel by one. 

5. Go to the next pixel. 
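The five steps above can be sketched for a single chrominance channel. This is a
simplified illustration: the pixels are visited in storage order rather than in an
order determined by a pseudo-noise function, no reference colours are reserved for a
dictionary, and range checks at the channel limits are omitted.

```python
def embed_bits(chrominance, bits):
    """Return a copy of the channel in which the i-th value carries the i-th bit."""
    if len(bits) > len(chrominance):
        raise ValueError("message is longer than the number of pixels")
    stego = list(chrominance)
    for i, bit in enumerate(bits):
        stego[i] += 1 if bit else -1  # "1": increase by one, "0": decrease by one
    return stego


def squared_difference(before, after) -> int:
    """Sum of squared differences between two channels of equal length."""
    return sum((a - b) ** 2 for a, b in zip(before, after))


if __name__ == "__main__":
    channel = [100, 100, 120, 120]
    stego = embed_bits(channel, [0, 1, 1])
    print(stego, squared_difference(channel, stego))
```

Since every embedded bit changes exactly one value by one, the summed squared
difference between the channel before and after embedding equals the number of
embedded bits.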

It can be assumed that these slight changes of the chrominance values are smaller than a
given jnd. They are not perceptible to the human eye and thus meet the derived demands
for an invisible embedding. If we consider the embedding from the point of view of the
embedding error Err_emb in the stego image of width w and height h, it emerges by
embedding of e bits as:

$$ \mathit{Err}_{emb} = \sum_{i=1}^{w \times h} (p_i - p'_i)^2 = e \tag{5} $$

Because Err_emb was not applied directly to the original image, it cannot be fully added
to the overall error Err_overall, which is calculated as the difference between the cover
and the stego image. Since each colour value may be increased or decreased during both
quantisation and embedding, four cases are possible. In two of the cases Err_quant is
reduced. If all possible cases are equally probable, the error Err_emb can be neglected.
From the point of view of the possible size of the additional information which can be
embedded with this procedure, a capacity C of 1 bpp emerges.

To make the reference colours accessible for the extraction process, a dictionary
is used, which is coded directly into the stego image by leaving the first visited
occurrence of each reference colour unchanged. Therefore it requires no further
transmission channels. The visual quality and the embedding error are not influenced by
the transfer of the dictionary.

After the inverse transformation to the starting colour system, the stego image is 
ready for transmission. 



The extraction of additional information. When the receiver gets the stego image
generated by the sender, it must first be transformed. The extraction function itself
consists of two stages. At first the dictionary with the reference colours is recovered
from the stego image. Based upon this, the additional information is extracted, as follows:

1. Consider a given pixel.

2. If the colour value of the pixel is not yet in the dictionary and the chrominance value 
does not vary by the value “1” from the chrominance value of a colour value in the 
dictionary, put this colour value in the dictionary. Go to the next pixel. 

3. If the chrominance value of the pixel colour varies by the value “1” from the chro- 
minance value of a colour value in the dictionary, associate this pixel colour to the 
found reference colour. 

4. If the chrominance value of the given pixel is smaller than the chrominance value 
of the associated reference colour, the next bit of embedded information is “0”. 

5. If the chrominance value of the given pixel is larger than the chrominance value of 
the associated reference colour, the next bit of embedded information is “1”. 

6. Go to the next pixel. 

After the end of the algorithm the additional information has been extracted and the
message has been transferred.
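A round trip through embedding with an in-image dictionary and the extraction steps
above can be sketched as follows. The sketch rests on several assumptions that are not
taken from the text: pixels are (Y, U, V) integer tuples, the message is embedded in U
only, pixels are visited in storage order instead of an order steered by a pseudo-noise
function, and reference colours that share Y and V differ by at least three in U so
that the assignment stays unambiguous.

```python
def embed(pixels, bits):
    """Leave the first occurrence of each reference colour unchanged (dictionary),
    then code one bit into every further pixel until the message is used up."""
    seen = set()
    pending = list(bits)
    stego = []
    for y, u, v in pixels:
        if (y, u, v) not in seen:
            seen.add((y, u, v))               # first visit: becomes a dictionary entry
            stego.append((y, u, v))
        elif pending:
            bit = pending.pop(0)
            stego.append((y, u + (1 if bit else -1), v))
        else:
            stego.append((y, u, v))           # message finished: keep reference colour
    if pending:
        raise ValueError("cover offers too few repeated reference colours")
    return stego


def extract(stego):
    """Rebuild the dictionary on the fly and read one bit per deviating pixel."""
    dictionary = set()
    bits = []
    for y, u, v in stego:
        if (y, u - 1, v) in dictionary:       # one above a reference colour
            bits.append(1)
        elif (y, u + 1, v) in dictionary:     # one below a reference colour
            bits.append(0)
        else:
            dictionary.add((y, u, v))         # new (or repeated) reference colour
    return bits


if __name__ == "__main__":
    cover = [(50, 100, 30), (50, 100, 30), (60, 110, 30), (50, 100, 30), (60, 110, 30)]
    stego = embed(cover, [1, 0, 1])
    print(stego, extract(stego))
```

Pixels that still show an unmodified reference colour carry no bit, so the receiver
reads exactly the bits that were embedded.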

6.2 Runtime Adaptivity 

By the use of reference colours as reference points for extraction, different modifications 
and adjustments can be realised at runtime of the proposed steganographic system. This 
affects all three demands on such systems, as follows: 

Imperceptibility: In the previous section, cases were discussed where a reduction of
Err_quant can result during the embedding. To exploit this, the embedding algorithm
can be modified to enforce coding in the direction of the original colour value.
Thus, the overall error can be decreased to Err_overall = Err_quant - Err_emb.

Capacity: The described coding of individual bits as parts of the message supplies the
smallest possible capacity of 1 bpp. Through the use of larger steps in the deviation a
from associated reference colours, it is possible to code groups of bits instead of a
single bit per pixel. So a capacity of ((log_2 a) + 1) bpp can be reached theoretically,
at the cost of a decreased imperceptibility.

Robustness: Lossy image compression causes the largest loss of information in the
chrominance channels. In this case the luminance channel can be used for
embedding. The emerging stronger visual distortions can be decreased by using the
HVS-based methods discussed in section 4. Here only suitable luminance values are
used for embedding. Pixels with other luminance values are skipped, which results
in a decreasing capacity.

These adjustments can be applied to the system without a re-design at runtime by
changing the priority values assigned to the three global demands imperceptibility,
capacity and robustness.
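The capacity formula for larger deviations can be made concrete with a small coding
table. The sketch assumes that a is a power of two and that every non-zero offset
between -a and +a from the reference colour represents one group of bits; the concrete
mapping is a hypothetical choice, not the one used in the example implementation.

```python
def bits_per_pixel(a: int) -> int:
    """(log2 a) + 1 for a maximum deviation a that is a power of two."""
    if a < 1 or a & (a - 1):
        raise ValueError("a must be a power of two")
    return a.bit_length()


def encode_offset(value: int, a: int) -> int:
    """Map a group of bits (as an integer 0 .. 2a-1) to a non-zero offset in [-a, a]."""
    if not 0 <= value < 2 * a:
        raise ValueError("value does not fit into the available offsets")
    return value - a if value < a else value - a + 1


def decode_offset(offset: int, a: int) -> int:
    """Inverse of encode_offset."""
    if offset == 0 or abs(offset) > a:
        raise ValueError("offset carries no information for this deviation")
    return offset + a if offset < 0 else offset + a - 1


if __name__ == "__main__":
    for a in (1, 2, 4, 8):
        print(a, bits_per_pixel(a), [encode_offset(v, a) for v in range(2 * a)])
```

For a = 1 the mapping reduces to the basic scheme (decrease for "0", increase for
"1"). There are 2a usable offsets, which is where the additional bit in
(log_2 a) + 1 comes from.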




Fig. 4. The example implementation delivers good results for some types of cover images. 



6.3 Experimental Results 

This section examines the discussed requirements for steganographic systems using
quantitative measures. Except for the entropy (flag: 0.758), the values of the three
different images in Fig. 4 have been averaged to obtain the given results.



Table 1. Measurements for the estimation of invisible embedding and capacity.

Message length   % of image information size   SNR (dB)   MSE      CTT max   Entropy
261 bytes        0.07                          33.93      71.08    71        0.35
4510 bytes       1.13                          34.35      64.71    71        0.41
16.31 Kbytes     4.18                          35.85      46.45    69        0.44
32.62 Kbytes     8.36                          35.70      47.47    69        0.49
48.93 Kbytes     12.55                         33.55      77.20    63        0.51
65.24 Kbytes     16.73                         30.48      155.66   72        0.49




With regard to invisible embedding, the results are listed in Table 1. These
comparatively low distortions in the stego image indicate that the technique meets the
main requirement⁷. Besides the use of an effective quantisation algorithm, these results
are due to the selection of the respectively best parameters for embedding (cf. Table 2).
At low message lengths, the quantisation error largely determines the magnitude of the
SNR and MSE. The reduction of the colours also causes large changes of the entropy.
Remarkably, the SNR increases with a suitable increase of the message size, since
Err_quant is compensated by the embedding as discussed in section 6.2. Coding the
information in the direction of the original colour value can therefore achieve better
visual results. At message lengths > 16 KB, however, the SNR falls and the MSE rises
again. The reason for this is the necessary capacity increase at the cost of decreased
imperceptibility, causing "over-compensation" of Err_quant by the new colour values
departing again from the original ones. If we choose an embedding that focuses on a
higher capacity (65 KB), we obtain the strongest distortions. As for the subjective image
quality (cf. Figs. 4 and 5), there are no significant perceptible distortions. This is once
again a result of a well-working quantisation and the exploitation of the jnd.

⁷ More detailed measurements and comparisons with other steganographic techniques
are given in [Ros00].

Fig. 5. The resulting difference after embedding (normalised and inverted).
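The two objective measures can be computed as sketched below. The text does not
define them, so the usual definitions are assumed here: the mean squared error between
cover and stego values, and the signal-to-noise ratio as ten times the decadic logarithm
of the cover energy divided by the energy of the difference.

```python
import math


def mse(cover, stego) -> float:
    """Mean squared error between two equally long sequences of colour values."""
    if len(cover) != len(stego) or not cover:
        raise ValueError("images must be non-empty and of equal size")
    return sum((c - s) ** 2 for c, s in zip(cover, stego)) / len(cover)


def snr_db(cover, stego) -> float:
    """Signal-to-noise ratio in dB (assumed definition, see the paragraph above)."""
    noise = sum((c - s) ** 2 for c, s in zip(cover, stego))
    signal = sum(c * c for c in cover)
    if noise == 0:
        return math.inf          # identical images
    if signal == 0:
        return -math.inf         # black cover, any change dominates
    return 10 * math.log10(signal / noise)


if __name__ == "__main__":
    cover = [100, 100, 100, 100]
    stego = [101, 99, 100, 100]
    print(mse(cover, stego), snr_db(cover, stego))
```

A larger distortion raises the MSE and lowers the SNR, so the two measures move in
opposite directions.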



Table 2. Parameters for steering the embedding function.

Message length   Channel       Focus of coding   Capacity of coding
261 bytes        Chrominance   visual quality    1 bpp
4510 bytes       Chrominance   visual quality    1 bpp
16.31 Kbytes     Chrominance   visual quality    1 bpp
32.62 Kbytes     Chrominance   visual quality    2 bpp
48.93 Kbytes     Chrominance   visual quality    3 bpp
65.24 Kbytes     Chrominance   high capacity     4 bpp




With the new approach it is possible to achieve very large capacities. By a suitable
choice of the parameters (cf. Table 2), depending on the size of the message, a gradual
increase of the capacity at good image quality is reached. In the example
implementation, almost 1/6 of the channel capacity can be used for the additional
information. For pictures in GIF image format, it can be shown that a value of 1/2 can be
reached.
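The two fractions can be checked by simple arithmetic. The calculation assumes that
they relate the largest coding capacity of 4 bpp to the channel capacity n of the image
format (24 bpp for BMP, 8 bpp for GIF); this reading is an assumption, since the text
does not spell out the calculation.

```latex
\frac{4\ \text{bpp}}{24\ \text{bpp}} = \frac{1}{6} \approx 16.7\,\%,
\qquad
\frac{4\ \text{bpp}}{8\ \text{bpp}} = \frac{1}{2}
```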

Because the new approach is independent of specific image format algorithms,
it behaves robustly with respect to the exchange between image formats as long as
image information is stored losslessly⁸. Thus, the communication is not cut off after
changing the image format. This is possible because the dictionary and the message
parts are stored only in image colour information. For a lossy wavelet image compression
scheme, which is better scalable than the JPEG compression scheme, it can be shown that
there is a significant jump in the bit error rate after applying a compression ratio which
is higher than a certain threshold. Because this threshold is less than every tested JPEG
compression ratio, the new approach is not robust against lossy JPEG compression⁹. The
changes in image information affect the colour values as well as the dictionary, such that
there is no possibility to extract the message correctly. However, it is possible to handle
lossy image compression if the compression ratio is less than the threshold.

⁸ If one considers the limitation of the number of colours during the embedding process,
this includes palette image formats like GIF.



6.4 Attacks on the New Approach 

Besides the good results for invisible embedding, possible electronic attacks must
also be considered. Through the process of embedding, there is a
clustering of colours in the YUV colour space. Colours which contain a part of the
message are located around the corresponding reference colour. At high capacities the
clustering is stronger. If the GIF image format was used for embedding, the clustering is
very strong and a signature in the colour palette can be detected. Statistical attacks, e.g.
that proposed in [WP99], may also be able to detect the clustering of colours. To prevent
such attacks, special covers (e.g. with many similar colours) and a lower capacity can be
used, so that the telltale clusters appear regular. It is also possible to use only some
pixels and leave others in their original colour. This procedure disintegrates the clusters
and mostly causes cluster tests to fail.



7 Conclusion and Future Work 

Many steganographic techniques cannot satisfy the requirements they are confronted
with. These demands conflict with one another and must be prioritised depending also on
additional ancillary requirements. In this work the prioritisation was accomplished from
the viewpoint of steganography. Based upon this, guidelines for the selection of suitable
cover images, depending on the particular requirements, could be derived. With an
advanced model for human visual perception it would be possible to improve these
guidelines or even to automate the selection process. At this time a model for the HVS is
the research topic of several ongoing projects. Imperceptible embedding could be achieved
by exploiting the lower sensitivity of the HVS for chrominance components of colour
images. Furthermore, we have discussed opportunities for imperceptible embedding into
the luminance component based on HVS properties.

In order to react as flexibly as possible to changes in ancillary requirements of a
communication, a framework for a steganographic system was presented using a concrete
example. Our method is based on the use of so-called reference colours for embedding and
extraction and uses the derived statements for potential covers. Related to the prioritised
demands, very good results could be achieved. Especially at ratios of 3-10% between
the message size and the image information size, a very good objective image quality
is reached. The example implementation provides a capacity of up to 4 bpp.
Because the largest distortions are introduced by quantisation and this delivers only
slight visual distortions in the discussed example, the subjective image quality of the
stego images is also very good. Cover and stego image are distinguishable only after
the application of a strong quantisation and by a direct comparison. Our approach is not
robust against lossy JPEG compression, but an exchange of the stego image between image
formats which allow lossless compression is possible.

⁹ Also after applying the described adaptation.

In this work an example for the concrete implementation of the framework was
presented which consists of individual interchangeable modules. If one wants to use the
framework with, e.g., different ancillary requirements, it will be necessary to exchange
the modules with others which are designed to consider the conditions and demands
for the employment of the modified system. One major direction for future work is the
development of modules which are more robust with respect to lossy image formats.

Abstract. Two robust spatial-domain watermarking algorithms for image
copyright protection are described in this paper. The first one is
robust against compression, filtering and cropping. Like all published 
crop-proof algorithms, the one proposed here requires the original image 
for mark recovery. Robustness against compression and filtering is ob- 
tained by using the JPEG algorithm to decide on mark location and 
magnitude; robustness against cropping is achieved through a repetition 
code. The second watermarking algorithm uses visual components and 
is robust against compression, filtering, scaling and moderate rotations. 



1 Introduction 

Electronic copyright protection schemes based on the principle of copy prevention
have proven ineffective or insufficient in recent years (see [9], [10]). The
recent failure of the DVD copy prevention system [13] is just another argument
supporting the idea that electronic copyright protection should rather rely on
copy detection techniques. Watermarking is a well-known technique for copy
detection, whereby the merchant selling the piece of information (e.g. an image)
embeds a mark in the copy sold.

In [12], three measures are proposed to assess the performance of information 
hiding schemes, which are a general class including watermarking schemes: 

Robustness: Resistance to accidental removal of the embedded bits. 
Capacity: The amount of information that may be embedded and later on 
recovered. 

Imperceptibility: Extent to which the embedding process leaves undamaged 
the perceptual quality of the covertext (the copyrighted information being 
traded with). 

* This work is partly supported by the Spanish CICYT under grant no. TEL98-0699- 
C02-02. 



J. Pieprzyk, E. Okamoto, and J. Seberry (Eds.): ISW 2000, LNCS 1975, pp. 44—53, 2000. 
© Springer- Verlag Berlin Heidelberg 2000 




Spatial-Domain Image Watermarking 






Commercial watermarking schemes surviving a broad range of manipulations
(e.g. Digimarc [3]) tend to be based on proprietary algorithms not available
in the literature. Published watermarking algorithms can be divided into oblivious
and non-oblivious; the first type does not require the original image for
mark reconstruction (for example, see [5,6]), while the second type does. To the
best of our knowledge, no published oblivious scheme can survive arbitrary cropping
attacks. On the other hand, a large number of proposals operate on transformed
domains (DCT, wavelet) rather than on the spatial domain. Spatial domain
watermarking is attractive because it provides a better intuition on how to attain
an optimal tradeoff between robustness, capacity and imperceptibility. Thus,
coming up with public spatial domain algorithms which survive a broad range of
manipulations is an important issue.

1.1 Our Results 

We present in this paper two watermarking schemes which are robust against a 
very broad range of transformations. Both schemes operate in the spatial domain 
and are designed for image copyright protection. Their features are as follows: 

- The first algorithm is based on the ideas of [8], but offers greater simplicity 
(e.g. visual components are not used), capacity (more pixels can be used to 
convey mark bits) and robustness than [8]. Transformations survived include 
compression, filtering and cropping. The new scheme also improves on the 
earlier version [15] both in imperceptibility (higher signal-to-noise ratios) 
and robustness (cropping is now survived). 

- The second algorithm uses visual components, image tiling and mark 
redundancy to survive compression, filtering, scaling and moderate rotations. 

Section 2 describes the crop-proof watermarking algorithm. Section 3 descri- 
bes the scale-proof watermarking algorithm. Robustness of both algorithms is 
discussed in Section 4. Section 5 lists some conclusions and topics for further 
research. 



2 Crop-Proof Watermarking 

The scheme can be described in two stages: mark embedding and mark recon- 
struction. Like all practical schemes known to date, the following one is symme- 
tric in the sense of [11]: mark embedding and recovery are entirely performed by 
the merchant M. 

For mark embedding, we assume that the image allows sub-perceptual 
perturbation. Assume that q is a JPEG quality level chosen in advance by the 
merchant M; q will be used as a robustness and capacity parameter. Also, let 
p be a Peak Signal-to-Noise Ratio (PSNR, [7]) chosen in advance by the 
merchant; p will be used as an imperceptibility parameter, i.e. M requires that 
the embedding process does not bring the PSNR below p dB. Let 
$\{s_i\}_{i \ge 1}$ be a random bit sequence generated by a cryptographically 
sound pseudo-random generator with secret key k only known to M. An image X 
will be represented as $X = \{x_i : 1 \le i \le n\}$, where n is the number of 
pixels and $x_i$ is the color level of the i-th pixel. For monochrome images, 
$n = w \times h$, where w and h are, respectively, the width and height of the 
image. For RGB color images, $n = 3(w \times h)$ (there are three matrices 
rather than one). 

Algorithm 1 (Mark embedding(p,q)) 

1. Compress X using the JPEG algorithm with quality q as input parameter. 
Call the bitmap of the resulting compressed image $X'$. Let 
$\delta_i := x_i - x'_i$ be the difference between corresponding pixels in X 
and $X'$. Only positions i for which $\delta_i \neq 0$ will be usable to embed 
bits of the mark. 

2. Call e the mark to be embedded. Encode e using an error-correcting code 
(ECC) to obtain the encoded mark E; call $|E|$ the bit-length of E. Replicate 
the mark E to obtain a sequence $E'$ with as many bits as pixels in X with 
$\delta_i \neq 0$. 

3. Let j := 0. For i = 1 to n do: 

   a) If $\delta_i = 0$ then $x''_i := x_i$. 

   b) If $\delta_i \neq 0$ then 

      i. Let j := j + 1. Compute $s'_j := e'_j \oplus s_j$, where $e'_j$ is the 
         j-th bit of $E'$. The actual bit that will be embedded is $s'_j$. 

      ii. If $s'_j = 0$ then compute $x''_i := x_i - \delta_i$. 

      iii. If $s'_j = 1$ then compute $x''_i := x_i + \delta_i$. 

4. While $\mathrm{PSNR}(X''|X) < p$ do 

   a) Randomly pick an index i such that $1 \le i \le n$. 

   b) If $x''_i - x_i > 3$ then $x''_i := x''_i - 1$. 

   c) If $x''_i - x_i < -3$ then $x''_i := x''_i + 1$. 

$X'' = \{x''_i : 1 \le i \le n\}$ is the marked image, which yields at least 
$\mathrm{PSNR}(X''|X) = p$ dB (PSNR of $X''$ with respect to X). The influence 
of q on capacity and robustness is discussed below. The use of the value 3 when 
adjusting the 
PSNR is empirically justified: this is the minimal magnitude that reasonably 
survives the attacks considered in the next section. In addition to PSNR, qua- 
lity metrics such as the ones in [4,7] can be used to measure imperceptibility 
of the mark; if such complementary measures are not satisfactory, then re-run 
Algorithm 1 with a higher q. 
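
Steps 3 and 4 of Algorithm 1 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the JPEG-compressed bitmap is taken as an input (`x_comp`) rather than computed, `random.Random` stands in for the cryptographically sound generator, the names `psnr` and `embed_mark` are hypothetical, and Step 4 picks only among pixels whose perturbation exceeds 3 so that the loop always terminates. Clipping to the valid color range is not handled.

```python
import math
import random


def psnr(marked, original, peak=255):
    """PSNR in dB of `marked` with respect to `original` (flat pixel lists)."""
    mse = sum((m - o) ** 2 for m, o in zip(marked, original)) / len(original)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


def embed_mark(x, x_comp, encoded_mark, key, p):
    """Steps 3-4 of Algorithm 1 on flat lists of pixel levels.

    x            original pixels
    x_comp       pixels of the JPEG-compressed original X' (computed elsewhere)
    encoded_mark bits of the ECC-encoded mark E
    key, p       secret key k and target PSNR in dB
    """
    rng = random.Random(key)  # stand-in only: NOT cryptographically sound
    marked = []
    j = 0  # counts usable pixels (delta != 0)
    for xi, xci in zip(x, x_comp):
        delta = xi - xci
        if delta == 0:
            marked.append(xi)  # position cannot carry a bit
            continue
        e_bit = encoded_mark[j % len(encoded_mark)]  # j-th bit of replicated E'
        s_bit = rng.getrandbits(1)                   # j-th pseudo-random bit
        j += 1
        marked.append(xi + delta if e_bit ^ s_bit else xi - delta)

    # Step 4: shrink perturbations larger than 3 until the PSNR target is met.
    # Assumption: stop when no such pixel is left, even if the target is missed.
    picker = random.Random()
    while psnr(marked, x) < p:
        large = [i for i in range(len(x)) if abs(marked[i] - x[i]) > 3]
        if not large:
            break
        i = picker.choice(large)
        marked[i] += -1 if marked[i] > x[i] else 1
    return marked
```

In a real setting `x_comp` would come from decoding a JPEG of X saved at quality q.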

For mark reconstruction, knowledge of X and the secret key k is assumed (k 
is used to regenerate the random sequence $\{s_i\}_{i \ge 1}$); knowledge of 
the original mark e is not assumed, which allows the proposed algorithm to be 
used for fingerprinting (some collusion-security can be achieved using dual 
binary Hamming codes as ECC, see [15]). Using X for mark reconstruction is a 
common feature in all published crop-proof watermarking algorithms; on the 
other hand, it is not a serious shortcoming in symmetric marking algorithms, 
where mark reconstruction is performed by and for the merchant M; moreover, 
for still images, 
one can consider that the secret key is $(k, X)$. 

Algorithm 2 (Mark reconstruction(q)) 

1. Upon detecting a redistributed item $\hat{X}$, restore it to the bitmap 
format. 

2. Compress the corresponding original X with quality q to obtain $X'$. 

3. Let ones[·] and zeroes[·] be two vectors with $|E|$ integer positions 
initially all set to 0. 

4. Let j := 0. For i = 1 to n do: 

   a) Compute $\delta_i := x_i - x'_i$. 

   b) If $\delta_i \neq 0$ then 

      i. Let j := j + 1. If $j > |E|$ then j := 1. 

      ii. Compute $\hat{\delta}_i := \hat{x}_i - x_i$. 

      iii. If $\hat{\delta}_i = 0$ then $\hat{s}'_j := \#$, where # denotes 
           erasure. 

      iv. If $\hat{\delta}_i \times \delta_i > 0$ then $\hat{s}'_j := 1$. 

      v. If $\hat{\delta}_i \times \delta_i < 0$ then $\hat{s}'_j := 0$. 

      vi. If $\hat{s}'_j \neq \#$ then $\hat{e}'_j := \hat{s}'_j \oplus s_j$; 
          otherwise $\hat{e}'_j := \#$. 

      vii. If $\hat{e}'_j = 1$ then ones[j] := ones[j] + 1. 

      viii. If $\hat{e}'_j = 0$ then zeroes[j] := zeroes[j] + 1. 

5. For j = 1 to $|E|$ do: 

   a) If ones[j] > zeroes[j] then $\hat{e}_j := 1$, where $\hat{e}_j$ is the 
      j-th bit of the recovered mark $\hat{E}$. 

   b) If ones[j] < zeroes[j] then $\hat{e}_j := 0$. 

   c) If ones[j] = zeroes[j] then $\hat{e}_j := \#$. 

6. Decode $\hat{E}$ with the same ECC used for embedding to obtain $\hat{e}$. 

Note that the redistributed $\hat{X}$ may have width $\hat{w}$ and height 
$\hat{h}$ which differ from w and h due to manipulation by the re-distributor; 
this would cause the number of pixels $\hat{n}$ in $\hat{X}$ to be different 
from n. In Section 4, we discuss how to deal with attacks altering n. 
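
The voting in Steps 3 to 5 of Algorithm 2 can be sketched as below. The sketch assumes the redistributed image has the same number of pixels as the original, uses `random.Random` as a stand-in for the keyed generator, and assumes the pseudo-random bit is indexed by the running count of usable pixels (so that it mirrors the embedding), while the vote slot wraps around $|E|$. The names `reconstruct_mark` and `ERASURE` are hypothetical, and ECC decoding (Step 6) is left out.

```python
import random

ERASURE = None  # plays the role of the erasure symbol '#'


def reconstruct_mark(x_hat, x, x_comp, mark_len, key):
    """Steps 3-5 of Algorithm 2; returns the recovered encoded mark.

    x_hat    pixels of the redistributed image (same length as x assumed)
    x        original pixels
    x_comp   pixels of the JPEG-compressed original X'
    mark_len |E|, the bit-length of the encoded mark
    """
    rng = random.Random(key)  # must mirror the generator used at embedding
    ones = [0] * mark_len
    zeroes = [0] * mark_len
    j = 0  # counts usable pixels
    for xh, xi, xci in zip(x_hat, x, x_comp):
        delta = xi - xci
        if delta == 0:
            continue
        s_bit = rng.getrandbits(1)  # consumed for every usable pixel
        slot = j % mark_len         # position of this vote within E
        j += 1
        delta_hat = xh - xi
        if delta_hat == 0:
            continue                # erasure: this pixel casts no vote
        s_rec = 1 if delta_hat * delta > 0 else 0
        if s_rec ^ s_bit:
            ones[slot] += 1
        else:
            zeroes[slot] += 1
    return [1 if o > z else 0 if o < z else ERASURE
            for o, z in zip(ones, zeroes)]
```

Because every copy of a bit votes independently, a cropped or filtered image can still yield the correct majority as long as enough usable pixels survive.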

3 Scale-Proof Watermarking 

Like the previous one, this scheme operates in the spatial domain and is a 
symmetric one. Again, we assume the original image is represented as 
$X = \{x_i : 1 \le i \le w \times h\}$, where $x_i$ is the color level of the 
i-th pixel and w and h are, respectively, the width and height of the image. 
For RGB color images, mark embedding and reconstruction are done 
independently for each color plane. 

We next propose an algorithm for computing the visual components of the 
pixels in the image, that is their perceptual value. The idea underlying Algo- 
rithm 3 is that dark pixels and those pixels in non-homogeneous regions are 
the ones that can best accommodate embedded information while minimizing the 
perceptual impact. 

Algorithm 3 (Visual components) 

1. For i = 1 to n do: 

   a) Compute $m_i := \max_j |x_i - x_j|/4$, for all pixels j which are 
      neighbors of pixel i on the image (there are up to eight neighbors); 
      $m_i$ can be regarded as a kind of discrete derivative at pixel i. To 
      bound the value of $m_i$ between 1 and 4, perform the following 
      corrections: 

      i. If $m_i > 4$ then $m_i := 4$. 

      ii. If $m_i = 0$ then $m_i := 1$. 

   b) Compute the darkness of the i-th pixel as 
      $d_i := (70 - x_i) \times 4/70$ if $x_i < 70$ and $d_i := 0$ otherwise. 
      We consider a pixel to be dark if its color level is below 70. The value 
      of $d_i$ lies between 0 and 4. 

   c) Compute the preliminary visual component of the i-th pixel as 
      $v_i := \max(m_i, d_i)$. 

2. For i = 1 to n compute the final visual component of the i-th pixel as 
$V_i := \max_j v_j$, for all pixels j which are neighbors of i on the image 
plus pixel i itself. 
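
A minimal Python sketch of Algorithm 3 on a grey-level image stored as a list of rows is given below. Two points are assumptions rather than facts from the text: the division by 4 is taken to be integer division (otherwise the correction for a zero value would not bound the result below by 1), and the darkness is computed as (70 - x) * 4 / 70, which matches the stated range of 0 to 4. The function names are hypothetical.

```python
def neighbours(r, c, h, w):
    """Yield the (up to eight) in-image neighbours of pixel (r, c)."""
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr or dc) and 0 <= r + dr < h and 0 <= c + dc < w:
                yield r + dr, c + dc


def visual_components(img):
    """Return the final visual component of every pixel of a 2-D image."""
    h, w = len(img), len(img[0])

    # Step 1: preliminary component = max(local contrast, darkness).
    prelim = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            contrast = max(
                (abs(img[r][c] - img[a][b]) for a, b in neighbours(r, c, h, w)),
                default=0,
            )
            m = min(contrast // 4, 4)   # assumed integer division, capped at 4
            if m == 0:
                m = 1                   # floor of 1
            x = img[r][c]
            d = (70 - x) * 4 / 70 if x < 70 else 0  # assumed darkness formula
            prelim[r][c] = max(m, d)

    # Step 2: final component = max over the pixel and its neighbours.
    final = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            final[r][c] = max(
                [prelim[r][c]] + [prelim[a][b] for a, b in neighbours(r, c, h, w)]
            )
    return final
```

A flat bright region gets the minimum value 1 everywhere, while an edge or a dark pixel raises the value of its whole neighbourhood through Step 2.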

Mark embedding is based on visual components and encrypts the mark bits 
using a random bit sequence $\{s_i\}_{i \ge 1}$ generated by a 
cryptographically sound pseudo-random generator with secret key k only known 
to M. 

Algorithm 4 (Mark embedding(p,r)) 

1. Divide the image into the maximum possible number of square tiles of p 
pixels side, so that there is an r pixels wide band between neighboring tiles 
(the band separates tiles). Let q be the number of resulting tiles. Each tile 
will be used to embed one bit, so q is the capacity of this watermarking 
scheme. 

2. Call e the mark to be embedded. Encode e using an error-correcting code 
(ECC) to obtain the encoded mark E. If $|E|$ is the bit-length of E, we must 
have $|E| \le q$. Replicate the mark E to obtain a sequence $E'$ with q bits. 

3. For i = 1 to q compute $s'_i := e'_i \oplus s_i$, where $e'_i$ is the i-th 
bit of $E'$. 

4. To embed the i-th encrypted mark bit $s'_i$ into the i-th tile do: 

   a) If $s'_i = 0$ then $x'_j := x_j - V_j$ for all pixels $x_j$ in the i-th 
      tile. 

   b) If $s'_i = 1$ then $x'_j := x_j + V_j$ for all pixels $x_j$ in the i-th 
      tile. 

$X' = \{x'_i : 1 \le i \le w \times h\}$ is the marked image. The band between tiles is 
never modified and it helps to avoid perceptual artifacts that could appear as a 
result of using two adjacent tiles to embed a 0 and a 1. The use of a range 1 to 4 
for visual components is empirically justified: an addition or subtraction of up to 
4 to the color level can hardly be perceived but at the same time survives most 
subperceptual manipulations. Regarding parameters p and r, we recommend 
p = 5 and r = 3 as a tradeoff between capacity (which would favor tiles as 
small as possible and intertile bands as narrow as possible), robustness (the 
larger a tile, the more redundancy in bit embedding and the more likely 
correct bit reconstruction is) and imperceptibility (the wider a band, the 
lower the chance of artifacts). 
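
The tiling and embedding of Algorithm 4 can be sketched as follows. The text does not say where the first tile starts, so the sketch assumes tiles are laid out from the top-left corner with a stride of p + r pixels; `random.Random` again stands in for the cryptographically sound generator, and `vis` is assumed to hold the visual components of Algorithm 3. The names `tile_origins` and `embed_tiles` are hypothetical.

```python
import random


def tile_origins(h, w, p, r):
    """Top-left corners of p x p tiles separated by r-pixel bands.

    Assumed layout: the first tile starts at (0, 0) and tiles repeat
    every p + r pixels in both directions.
    """
    step = p + r
    return [(top, left)
            for top in range(0, h - p + 1, step)
            for left in range(0, w - p + 1, step)]


def embed_tiles(img, vis, encoded_mark, key, p=5, r=3):
    """Embed one encrypted bit per tile; band pixels are left untouched."""
    h, w = len(img), len(img[0])
    origins = tile_origins(h, w, p, r)       # q = len(origins) is the capacity
    if len(encoded_mark) > len(origins):
        raise ValueError("encoded mark is longer than the capacity q")
    rng = random.Random(key)                 # stand-in, not cryptographic
    out = [row[:] for row in img]
    for t, (top, left) in enumerate(origins):
        e_bit = encoded_mark[t % len(encoded_mark)]  # t-th bit of replicated E'
        sign = 1 if e_bit ^ rng.getrandbits(1) else -1
        for a in range(top, top + p):
            for b in range(left, left + p):
                out[a][b] = img[a][b] + sign * vis[a][b]
    return out
```

With p = 5 and r = 3, a 13 x 13 image holds four tiles under this layout, and rows and columns 5 to 7 form the unmodified band.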

The assumptions for mark reconstruction are identical to those made for the 
scheme of Section 2, namely knowledge of X and k (to regenerate the random 
sequence $\{s_i\}_{i \ge 1}$). No knowledge of the original mark e is assumed, 
so that the proposed scheme is also usable for fingerprinting. Let $\hat{X}$ 
be the redistributed image, and let $\hat{w}$ and $\hat{h}$ be its width and 
height. 

Algorithm 5 (Mark reconstruction(p,r)) 

1. Let ones[·] and zeroes[·] be two vectors with $|E|$ integer positions 
initially all set to 0. 

2. From the length p of the tile side, the width r of the intertile band and 
X, compute the estimated number of tiles q. 

3. For t = 1 to q do: 

   a) Let $u := 1 + ((t - 1) \bmod |E|)$. 

   b) For each pixel in the t-th tile of the original image X do: 

      i. Let i and j be the row and column of the considered original pixel, 
         which will be denoted by $x_{ij}$. 

      ii. Locate the pixel $\hat{x}_{ab}$ in the marked image $\hat{X}$ 
          corresponding to $x_{ij}$. To do this, let 
          $a := i \times \hat{h}/h$ and $b := j \times \hat{w}/w$. 

      iii. Compute $\hat{\delta}_{ij} := \hat{x}_{ab} - x_{ij}$. 

      iv. If $\hat{\delta}_{ij} > 0$ then ones[u] := ones[u] + 1. 

      v. If $\hat{\delta}_{ij} < 0$ then zeroes[u] := zeroes[u] + 1. 

4. For u = 1 to $|E|$ do: 

   a) If ones[u] > zeroes[u] then $\hat{s}'_u := 1$, where $\hat{s}'_u$ is the 
      recovered version of the u-th embedded bit. 

   b) If ones[u] < zeroes[u] then $\hat{s}'_u := 0$. 

   c) If ones[u] = zeroes[u] then $\hat{s}'_u := \#$, where # denotes erasure. 

   d) If $\hat{s}'_u \neq \#$ then $\hat{e}_u := \hat{s}'_u \oplus s_u$; 
      otherwise $\hat{e}_u := \#$, where $\hat{e}_u$ is the u-th bit of the 
      recovered mark $\hat{E}$. 

5. Decode $\hat{E}$ with the same ECC used for embedding to obtain $\hat{e}$. 
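
The coordinate mapping and sign vote of Step 3(b) are what make the scheme tolerate scaling. The sketch below covers only that step for a single tile, using 0-based indices and integer (floor) division, which are assumptions; the name `tile_vote` is hypothetical. The returned counts would be added to ones[u] and zeroes[u] by the caller.

```python
def tile_vote(x_hat, x, top, left, p):
    """Count positive and negative differences over one tile of the original.

    x_hat      redistributed (possibly rescaled) image, list of rows
    x          original image, list of rows
    top, left  top-left corner of the tile in the original image
    p          tile side in pixels
    Returns (ones, zeroes) for this tile.
    """
    h, w = len(x), len(x[0])
    h_hat, w_hat = len(x_hat), len(x_hat[0])
    ones = zeroes = 0
    for i in range(top, top + p):
        for j in range(left, left + p):
            a = i * h_hat // h   # row of the corresponding rescaled pixel
            b = j * w_hat // w   # column of the corresponding rescaled pixel
            diff = x_hat[a][b] - x[i][j]
            if diff > 0:
                ones += 1
            elif diff < 0:
                zeroes += 1
    return ones, zeroes
```

Since every pixel of a tile carries the same bit, the vote stays lopsided whether the tile has been enlarged or shrunk.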

4 Robustness Assessment 

The two schemes described above were implemented using a dual binary 
Hamming code DH(31,5) as ECC. The base test of the StirMark 3.1 benchmark [9] 
was used to evaluate their robustness. The following images from [14] were 
tried: Lena, Bear, Baboon and Peppers. A 70-bit long mark e was used for both 
schemes, which resulted in an encoded E with $|E| = 434$ (14 blocks of 5 
bits, each encoded into 31 bits). 

4.1 Robustness of the Crop-Proof Scheme 

Using the scheme of Section 2, the percent of the n pixels that can be used 
to convey a mark bit (those with $\delta_i \neq 0$) ranges between 78.9% for 
q = 90% and 90.7% for q = 10% (there are slight variations between images). 
Values 
p = 38dB, q = 60% were tried on the above images, and the following StirMark 
manipulations were survived by the embedded mark: 

1. Color quantization. 

2. All low pass filtering manipulations. More specifically: 

a) Gaussian filter (blur). 

b) Median filter (2 x 2, 3 x 3 and 4 x 4). 

c) Frequency mode Laplacian removal [1]. 

d) Simple sharpening. 

3. JPEG compression for qualities 90% down to 30% (to resist 20% and 10% a 
lower value for q would be necessary). 

4. Rotations with and without scaling of —0.25 up to 0.25 degrees. 

5. Shearing up to 1% in the X and Y directions. 

6. All StirMark cropping attacks. These are resisted thanks to the mark being 
repeatedly embedded in the image. To resynchronize, keep shifting the cropped 
$\hat{X}$ over X until the "right" relative position of $\hat{X}$ on X is 
found. The right position is estimated to be the one that, after running 
Algorithm 2 on $\hat{X}$ and the corresponding cropping of X, yields the 
minimal number of corrected errors at Step 6. 

7. Removal of rows and columns from the marked image $X''$ was automatically 
detected, dealt with, and survived exactly like cropping attacks. 

Additional rotation, scaling and shearing StirMark attacks can be detected and 
undone by M prior to mark reconstruction by using computer vision techniques 
to compare with the original image. The really dangerous attacks for the scheme 
presented here are random geometric distortions and attacks combining several 
of the aforementioned elementary manipulations. Figure 1 tries to show that 
marking is imperceptible. Extreme compression and cropping attacks for which 
the mark still survives are presented in Figure 2. 

An additional interesting feature of the presented algorithm is that multiple 
marking is supported. For example, merchant $M_1$ can mark an image and sell 
the marked image to merchant $M_2$, who re-marks the image with its own mark, 
and so on. As an example, if ten successive markings are performed on Lena, 
each with p = 38dB, the PSNRs with respect to the original image 
decrease as follows: 38, 35.1, 33.4, 32.28, 31.37, 30.62, 30.04, 29.5, 29.06 and 
28.7. It is worth noting that multiple marking does not reduce the robustness of 
each individual marking. The result of five successive markings is presented in 
Figure 3. 

4.2 Robustness of the Scale-Proof Scheme 

Fig. 1. Left, original Lena. Right, Lena after embedding a 70-bit mark using the scheme 
of Section 2 with q = 60% and PSNR=38dB 

Fig. 2. Scheme of Section 2. Left, marked Lena after JPEG 10% compression. Right, 
marked and cropped Lena. The mark survives both attacks. 

Fig. 3. Scheme of Section 2. Lena after five successive markings. 

Using the scheme of Section 3 with p = 5 and r = 3, the following StirMark 
manipulations were survived by the embedded mark: 

1. Color quantization. 

2. Most low pass filtering manipulations. More specifically: 

a) Gaussian filter (blur). 

b) Median filter (2x2 and 3x3). 

c) Frequency mode Laplacian removal [1]. 

d) Simple sharpening. 

3. JPEG compression for qualities 90% down to 20% (for some images down 
to 10%). 

4. Rotations with and without scaling of —0.25 up to 0.25 degrees. 

5. Shearing up to 1% in the X and Y directions. 

6. Cropping up to 1%. 

7. Row and column removal. 

8. All StirMark scaling attacks (scale factors from 0.5 to 2). 

Only small croppings are resisted because each tile embeds one bit of the mark; 
so, even if an ECC is used, we need at least $|E|$ tiles to be able to recover the 
mark. On the other hand, scaling is resisted because a mark bit is embedded in 
each pixel of a tile; even if the tile becomes smaller or larger, the correct bit can 
still be reconstructed. Extreme compression and scaling attacks for which the 
mark still survives are presented in Figure 4. 




Fig. 4. Scheme of Section 3. Left, marked Lena after JPEG 15% compression. Right, 
marked Lena after 50% scaling. The mark survives both attacks. 



5 Conclusion and Future Research 

The two watermarking schemes presented in this paper operate in the spatial 
domain and are thus simple and computationally efficient. Both proposals 
perform well under compression and low-pass filtering. The most prominent 
feature of the first scheme is that its marks survive cropping attacks; like 
all publicly known crop-proof schemes, the one presented here requires the 
original image for mark reconstruction. The second scheme presented in this 
paper has the advantage over the first one of resisting scaling attacks; 
however, its tiling approach does not allow it to survive substantial 
croppings. 

It is probably impossible to come up with a public-domain watermarking 
algorithm (i.e. one that satisfies Kerckhoffs' assumption of being entirely 
known) that can resist all kinds of manipulations. Therefore, future research 
will be devoted to combining the two schemes presented in this paper with each 
other and with other schemes in the literature (one possibility is to use a 
different scheme for each color plane). We believe that surviving a broad 
range of manipulations 
will only be feasible through combined use of watermarking algorithms based 
on a variety of principles. In fact, one could imagine an expert system which 
proposes a particular combination of watermarking methods depending on the 
robustness requirements input by the user. 

Abstract. Watermarking typically involves adding one sequence to another 
to produce a new sequence containing hidden information. This method 
is based on modifying a distribution, obtained via segmentation, 
to hide information, thus allowing subsets of the distribution to still po- 
tentially reveal the watermark. We use vector quantization to segment a 
colour image and then shift each distribution by a random amount. The 
result is that we can detect a watermark without pre-processing because 
only comparisons between distributions are needed. 



1 Introduction 

The area of digital watermarking is currently being extensively researched be- 
cause of the commercial interest in protecting multimedia. Duplication of digital 
information is a fairly simple task because of the nature of the digital domain. 
Hence, the owners of multimedia content are keen to be able to demonstrate 
their ownership in the event of copyright infringement. Robust watermarking 
provides a solution to this problem by inserting a watermark, W, into the ori- 
ginal content, I, to produce I'. This is done so that at a later stage, any person 
who has used I' as the basis of their work can be shown to have W as part of 
their work and thus the owner of I will be credited with the original work. 

Many different techniques have been developed in the area of robust water- 
marking, all with the goal of correctly identifying ownership. The qualities of 
a good watermark are extensive and apart from the ability to resist intentional 
erosion, they include: robustness to compression; invisibility; and resistance to 
geometrical attacks [12]. Current watermarking techniques are able to withstand 
compression to a fair degree, are perceptually invisible and are able to tolerate 
some forms of geometrical attack. Some examples of this are [7], [10] and [9]. 

StirMark, written by Fabien A. P. Petitcolas and Markus G. Kuhn [8] is a 
program which applies a broad range of attacks and was written to test the limits 
of watermarking schemes. It demonstrates that there are significant problems in 
developing a watermarking technique which correctly proves ownership in every 
situation. Usually, if an image is not presented in a form suitable for detection, 
then it comes as no surprise that the watermark cannot be traced. In this way, 
if an image has been cropped to one quarter of its original size, then the image 
under suspicion must be placed in its relative position inside the context of the 
original image in order for detection to proceed. This is clearly demonstrated in 
a paper written by Cox et al. [5], where the cropped portion is placed centrally 
inside the original image, to maintain the series of frequencies. This modification 
of the test image before detection is performed is known as pre-processing. 

Whilst StirMark contains many techniques for performing attacks which try 
to obscure the presence of the watermark, or at least make it unpresentable 
without pre-processing, at the heart of the program is the StirMark attack itself. 
This attack corrupts an image by subtly distorting the grid on which pixels 
fall and interpolating to form a new image. The reason why this attack is so 
successful is that many watermarking schemes require the pixel values to be 
fairly consistent. This attack actually causes very little perceptual distortion but 
the pixel values are changed so significantly that many watermarking schemes 
cannot recover the watermark afterwards. 

Because there are many geometrical attacks which (directly or indirectly) 
take the form of a presentation attack [11], we decided to try and develop a 
watermarking method whereby the presentation of an image does not need to 
be identical to its original form. This paper is mainly intended as a proof of this 
concept. We propose a technique with the objective of allowing the recovery of a 
watermark from a colour image, even if the image is not in its original form. For 
example, if the image has been cropped, or rotated at an angle that is irregular, 
our method has been devised to attempt to recover it, without any form of pre- 
processing. Our method is based on the idea of extracting regions from an image 
and modifying their distributions in a recoverable fashion. To do this, we aimed 
to insert the watermark by using information gleaned from the surrounding pixels 
or neighbourhoods. The effect of doing this is that even though the regions that 
have been used for insertion may not be fully recoverable, as long as the image 
itself is similar in content to the original form, then the watermark should be 
present at some level. Therefore, attacks such as cropping or scaling should have 
little effect and even a method such as StirMark potentially will not tamper 
with the watermark beyond its ability to be recovered, as the content of the 
watermark is bound to the semantic content of the image. 

The watermarking method is described in section 2 and the results are shown 
in section 3. These are analyzed in section 4 and then conclusions are drawn in 
section 5. 

2 The Watermarking System 

The initial problem was that if we insert a watermark, various underlying as- 
sumptions are made at the point of recovery. Firstly, most watermarking schemes 
implicitly use the idea of reference points. With Cox's method, if an image has 
been cropped or translated, then it must be re-set back into the original 
image, replacing the relevant portion. The watermark is not arbitrarily 
re-entrant: it must be detected in exactly the same location at which it was 
inserted. The watermarking algorithm cannot begin anywhere but expects a 
certain structure which must be maintained, and this is where the first part 
of pre-processing is required. The same argument is true of the Patchwork 
method [1]. If the first few 
rows had been deleted then the search for the patches of modified data would 
fail. Consequently, our idea was to divide the image into regions, chosen without 
the need for reference points, that would be consistently chosen, even in the face 
of distortions. 

The other half of the problem is that data integrity has to be assured, in 
terms of the number of terms in each region. Even with standard methods, if 
pixels are lost (or added), then there is a great potential for the transformations 
to fail, or provide inconsistent data. Secondly, there is a need to ensure that 
the data set remains consistent in size, for the benefit of synchronization. When 
a DCT is performed on a sequence, even if only a few terms are missing from 
within the sequence, then the coefficients will vary from those that are expected. 
The same is true for our region generation. If we gather pixels which "belong" 
together, then if at the point of detection, not all the pixels are present, then 
any form of transformation (for example, the DWT or DCT) will not be able to 
recover the watermark as it was intended. Likewise, even in the spatial domain, 
the correlation between encountered and expected pixels will render detection 
useless if extra terms are inserted, or pixels are missing. For this reason, in 
pre-processing, the dimensions of an image need to be restored to the original 
form so that detection can proceed. In our system, we are well aware that any 
inconsistencies in the regions between the original and inserted series, will destroy 
the synchronization that is necessary for recovery. 

Assume we have a key, K, and an original image I. I is a colour image, with 
three channels (red, green and blue), and contains M rows and N columns. We 
commence by generating a random number Z, where $Z \in [3, 8]$, depending on 
the value of K. We also use K to choose two colour channels: the source 
channel, $C_s$, and the modified channel, $C_m$. For example, $C_s$ might be 
the red channel, while $C_m$ is the blue channel. They are not permitted to be 
the same channel. 

2.1 Vector Quantization 

The starting point is to try and separate I into Z distinct regions, $R(I)_1$ to 
$R(I)_Z$. The reason for this is that we want to divide I into regions, so that 
in the event of cropping, similar regions may be recovered. If the regions are 
chosen based on geometrical properties, such as into squares of 8 by 8, then 
because of the need for reference points, the removal of rows and columns will 
cause the regions not to match up, forcing detection to be performed by brute 
searching. Likewise, we also encounter sensitivity if we use methods such as 
region growing [2]. If the pixels change values, then the regions will be able to 
grow and shrink, potentially causing large variations in the area of each region. 
As we are attempting to extract regions of interest from the image, we now use 
the process of vector quantization to divide up the image. Vector quantization 
requires that every pixel $P_{xy}$ in I, where $1 \le x \le M$ and 
$1 \le y \le N$, be placed into a matrix, M. Every row in M is composed of the 
following items of data about each $P_{xy}$: 



Table 1. Sample matrix, M 

          C_s value    Standard deviation    Average 
P_11      111.0        60.7                  54.0 
P_12      131.0        58.0                  81.3 
P_13      131.0        59.2                  83.3 











The columns representing the standard deviation and the average are compo- 
sed of the values of the pixels in the neighbourhood, centred on the current pixel. 
Areas outside of the boundary of I are taken to be zero. For our experiments, 
the neighbourhood was of radius 1. 

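For example, the row representing $P_{11}$ is composed of: 

\[
M_{11} = P_{11}
\]

\[
M_{12} = \sqrt{\frac{9 \sum_{i=-1}^{1} \sum_{j=-1}^{1} P_{1+i,\,1+j}^{2}
  - \left( \sum_{i=-1}^{1} \sum_{j=-1}^{1} P_{1+i,\,1+j} \right)^{2}}{9 \times 9}}
\]

\[
M_{13} = \frac{1}{9} \sum_{i=-1}^{1} \sum_{j=-1}^{1} P_{1+i,\,1+j}
\]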
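
As an illustration of how such rows could be computed, the sketch below builds one (value, standard deviation, average) triple per pixel of a single channel, treating positions outside the image as zero. It assumes the population form of the standard deviation over the full 3 x 3 window (nine terms for radius 1); the name `feature_rows` is hypothetical.

```python
import math


def feature_rows(channel, radius=1):
    """Return one (value, std, average) row per pixel of a 2-D channel.

    Neighbourhood positions outside the image contribute zeros, and the
    statistics always use the full window size n = (2 * radius + 1) ** 2.
    """
    h, w = len(channel), len(channel[0])
    n = (2 * radius + 1) ** 2
    rows = []
    for x in range(h):
        for y in range(w):
            total = total_sq = 0
            for dx in range(-radius, radius + 1):
                for dy in range(-radius, radius + 1):
                    a, b = x + dx, y + dy
                    v = channel[a][b] if 0 <= a < h and 0 <= b < w else 0
                    total += v
                    total_sq += v * v
            # Population standard deviation: sqrt(n * sum(v^2) - sum(v)^2) / n
            std = math.sqrt(max(0, n * total_sq - total ** 2)) / n
            rows.append((channel[x][y], std, total / n))
    return rows
```

For a corner pixel only four of the nine window positions lie inside the image, so both the average and the standard deviation are pulled strongly by the zero padding.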



The three columns listed above are the dimensions from which the space 
is constructed. The first part of the process is to generate an initial codebook 
- a set of vectors that are starting points for determining the centroid points. 
Our codebook is the matrix of Z vectors which have values in each of three 
dimensions: value; standard deviation; and average. We need to have consistency 
in this initial codebook between the original image and the detected image. Thus, 
using K, we randomly choose Z pixels within I. We sample at each pixel to 
recover values suitable for each of the dimensions and place them as our initial 
values for the vector quantization process. 

The process then iterates through this space of data and derives a set of 
centroid points within the space which is guaranteed to be a local minimum in 
terms of distances between them and all the points within the space. The exact 
form of vector quantization is known as k-means clustering, or nearest neighbour 
vector quantization. For a more complete explanation of this process, the reader 
is referred to [6]. The final outcome produces regions where the general objects 
have been identified but are not rigidly adhered to, as would be the case with 
regular segmentation, such as JPEG's 8 by 8 blocks. 

We chose vector quantization because it allowed control over the number of 
regions, as well as providing an optimum point for splitting up the space 
of the image. We can be assured that the region choices will not differ greatly, 
assuming we use the same initial codebook. Because the regions are not rigid 
in size, the way in which the watermark is inserted must be repeatable when 
synchronization is not guaranteed. 
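
A minimal sketch of the clustering step, assuming plain k-means with squared Euclidean distance on the three-dimensional rows and a codebook seeded from the key, is shown below. `random.Random` is used only as an illustrative keyed generator, and the names `initial_codebook` and `kmeans` are hypothetical.

```python
import random


def initial_codebook(points, key, z):
    """Pick z rows as starting centroids, reproducibly from the key."""
    rng = random.Random(key)
    return [points[i] for i in rng.sample(range(len(points)), z)]


def kmeans(points, codebook, iterations=20):
    """Nearest-neighbour vector quantization from a given initial codebook.

    Returns (labels, centroids), where labels[i] is the region index
    assigned to points[i].
    """
    centroids = [tuple(c) for c in codebook]
    labels = []
    for _ in range(iterations):
        # Assignment: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((a - b) ** 2
                                  for a, b in zip(pt, centroids[k])))
            for pt in points
        ]
        # Update: each centroid moves to the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:
                updated.append(tuple(sum(col) / len(members)
                                     for col in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids
```

Seeding from the key is what lets the detector start from the same codebook as the embedder, so that similar images converge to similar regions.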

2.2 Inserting the Watermark 

At this point, we have extracted Z regions from an image in a manner which is 
guaranteed to perform identically on the same image but will attempt to recover 
similar regions if the image differs slightly. The regions are dependent on various 
patterns of the distribution of $C_s$ content within I. If we modify this content in 
any way, then detection will be made difficult as the recovery of the regions will 
differ. 

For this reason, we take all the pixels that have been chosen within a region 
$R(I)_p$ and extract the $C_m$ components of each pixel. This provides us with 
a vector $V(p)$, where the size of $V(p)$ is equal to the size of $R(I)_p$ and 
each entry within the vector is the $C_m$ component of a pixel within 
$R(I)_p$. Every pixel within I will be represented in one of these Z vectors. 
Now we have generated Z distinct distributions of the $C_m$ content of the 
image, each focussing on a different pattern of $C_s$ content. We form these 
because we plan to modify the $C_m$ content instead of the $C_s$ content, 
which is used only to discriminate between regions. In this way, we allow 
ourselves the privilege of inserting the watermark, while keeping the method 
of region selection independent. Both $C_s$ and $C_m$ are independently 
stored, although there may well be a correlation between them. This means that 
an adjustment in one of the channels will not explicitly affect another 
channel. We are not using any special properties of the colours, as is made 
clear by the fact that they are chosen randomly. 

Now we have a distribution of $C_m$ pixels which may differ in number but, if 
the image is intact, we would expect that the distribution would stay roughly 
similar, irrespective of attacks. For this reason, we can now make changes in 
this domain, and expect it to remain as such. We then apply a function to the 
distribution, such as $f(x) = x + S$, where S is an integer, 
$S \in [-5, 5]$, as determined by K. So 
$\forall z \in [1, \mathrm{size}(V(p))]$, $V(p)_z = f(V(p)_z)$. Other possible 
functions include $f(x) = S_1 x + S_2$ and $f(x) = x^{S_1} + S_2$. 

We used a slightly modified version of the first form of $f(x)$ as above, for 
our experiments, to simply demonstrate that the approach is valid. We adjusted 
the function by setting a boundary for the changes, so that we do not produce 
pixels which are outside the possible choices for colour. In this way, if a 
pixel has its $C_m$ content shifted to above 255, then it is held at 255. 
Similarly, $C_m$ content shifted below zero is held at zero. After the 
distributions have each been adjusted, the pixels are returned to their 
relative locations within I, to form the watermarked image. 
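
The bounded shift described above can be sketched as follows, assuming one integer offset per region drawn from the key and a flat list of modified-channel values with a matching list of region labels. `random.Random` is an illustrative keyed generator and `insert_watermark` is a hypothetical name.

```python
import random


def insert_watermark(cm, labels, key, z):
    """Shift the modified-channel value of each pixel by its region's offset.

    cm      flat list of modified-channel values in 0..255
    labels  region index (0..z-1) of each pixel, from the segmentation
    Returns (marked values, per-region offsets).
    """
    rng = random.Random(key)                       # illustrative keyed generator
    shifts = [rng.randint(-5, 5) for _ in range(z)]  # one S per region (assumed)
    marked = [min(255, max(0, v + shifts[lab]))    # hold results within 0..255
              for v, lab in zip(cm, labels)]
    return marked, shifts
```

Pixels already near 0 or 255 absorb part of the shift through clamping, which is one reason the detected distribution will not be an exact translate of the original.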

The means by which the watermark has been inserted is now no longer 
dependent on the full contents of $I'$ being present, nor does it need to be 
pre-processed before detection takes place. While pre-processing will 
certainly improve the result, the regions have been determined based on the 
relationship that each pixel has with its neighbours. Also, the watermarking 
procedure is not dependent on having the exact number of pixels within each 
region. As long as the regional distributions are statistically similar to 
those of the original regions, detection should be able to proceed. 

Fig. 1. The method for insertion 

2.3 Detection 

The effort for detection is focussed on how to determine if two distributions are 
identical. As the same process can be applied on I, the original image, and J, the 
image to be tested, the task is to compare $R(I)_p$ and $R(J)_p$, $\forall p \in [1, Z]$. Because 
of the process outlined above, we would expect $R(J)_p = f(R(I)_p)$. There will 
be differences in the distribution, not only because of attacks, intentional and 
unintentional, but also because of the limits of change that can be applied, as 
mentioned above. As noted in the insertion section, $f$ has been adjusted so that 
truncation takes place. For simplicity, we will use the notation $D = f(R(I)_p)$ 
and $D' = R(J)_p$ and hence our comparisons in this phase will be between D and 
D'. 

Usually when watermarks are detected, the test sequence and the original 
sequence can be paired, term by term. For us, synchronization is not guaranteed 
and we need an unpaired statistical measure to give us the confidence that the 
pixels are from the same series. In order to verify that the two distributions are 
essentially identical, we decided to try and use standard statistical methods. The 
most appropriate test in this case appears to be the Kolmogorov Smirnov test 
for comparing distributions. The two-sided test comes with the two hypotheses: 

$H_0 : F(x) = F'(x)$ 

$H_1 : F(x) \neq F'(x)$ 



For our experiments, we wish to show that the null hypothesis is true, that is, 
that there are no significant differences between the two distributions. 
Unfortunately, because of the nature of the null hypothesis, finding no significant 
difference does not imply that we can accept $H_0$, only that we are unable to 
reject it. 

The Kolmogorov Smirnov test works by comparing the empirical distribution 
functions of two distributions. The greatest difference between these functions is 
recorded and measured against a baseline figure, in order to determine whether 
the null hypothesis can be rejected. This figure is obtained from a formula but 
for small numbers of samples (< 40), it can be read off a table. However, because 
of the large number of samples (numbering in the thousands), the level at which 
$H_0$ is rejected is unreasonably low. For example, if we have approximately 25,000 
samples in D, then at the 95% confidence level, the radius at which $H_0$ cannot be 
rejected is about 0.01. In light of the way in which attacks and alterations in the 
distribution can be carried out, we feel that this is not an accurate measurement 
of confidence in the likeness of distributions. 
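
The statistic at the heart of the test, the greatest difference between two empirical distribution functions, can be sketched without any paired samples. The code below is a minimal illustration and not the authors' implementation; the function name `ks_statistic` is hypothetical. Because both functions are step functions, it is enough to evaluate them at the observed sample values.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Largest absolute gap between the empirical distribution functions of a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    # Both edfs only change at sample values, so the supremum is reached there.
    for x in set(a_sorted) | set(b_sorted):
        fa = bisect_right(a_sorted, x) / len(a_sorted)
        fb = bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(fa - fb))
    return gap


print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))  # 0.5
```

The two inputs may have different lengths, which matches the situation where a tested region has lost or gained pixels.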

The purpose of the Kolmogorov Smirnov test is to spot significant differences, 
while our own purpose in performing the test is that of trying to measure the 
significant similarities. A large number of samples will only demonstrate that 
there are differences because of the tight restrictions placed. Instead, we need to 
ascertain our own set of limits and conditions on the result of the Z Kolmogorov 
Smirnov tests that are performed. The results of our experiments form the basis 
of these limits. 



The threshold for the difference in edf's is normally given by the formula 
$\frac{1.36}{\sqrt{n + \sqrt{n/10}}}$ at the 95% confidence level, when the number of samples exceeds 
40 [4], although it is noted that for a discrete distribution (such as our case), 
this is a conservative estimate. We need to establish boundaries so that we can 
maximize the power of the test. However, the result of pursuing non-statistical 
boundaries is that the results can only be claimed empirically, not statistically. 
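
As a worked check, assuming the large-sample approximation $1.36/\sqrt{n + \sqrt{n/10}}$ for the 95% level, the earlier example of about 25,000 samples can be evaluated directly. The function name `ks_threshold_95` is hypothetical.

```python
import math


def ks_threshold_95(n):
    """Approximate 95% critical value for the edf difference, intended for n > 40."""
    return 1.36 / math.sqrt(n + math.sqrt(n / 10))


print(round(ks_threshold_95(25000), 4))  # 0.0086
```

The value is just under 0.01, in line with the figure quoted above, and shows how little the two distributions may differ before the statistical test rejects them.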



Each comparison will yield either a positive or negative result, depending 
on whether it is less than the thresholds set. The proportion of tests passed 
gives the final "score" for the detection measurement. So if I' was tested, then 
Z tests will be performed on region comparisons. The detection score is 

$d = \frac{\text{Number of tests passed}}{Z} \times 100$. 

The next step is to determine the boundaries for two distributions to be 
declared similar. 

2.4 Detection Thresholds 

If we are to be confident about the similarity of D and D', there are two intrinsic 
features which need to be compared: 

$\sup(\mathrm{edf}(f(D)) - \mathrm{edf}(D')) < \alpha$    (1) 

$\mathrm{abs}(\overline{f(D)} - \overline{D'}) < \beta$    (2) 

Equation 1 states that the two distributions must be similar in accordance 
with the principles of the Kolmogorov Smirnov test. This guarantees that matching 
will only take place if there is some statistical validity. Equation 2 is 
introduced to allow verification of the insertion of the watermark. If the two 
distributions are relatively flat, then the means can differ by a significant distance 
but equation 1 will not fail (as we have elevated the boundary condition above 
that given by statistics). So the trade-off has been to allow some leniency in 
the Kolmogorov Smirnov test with the additional condition of tightness in the 
difference of means. The objective is to determine satisfactory values for both 
$\alpha$ and $\beta$, which reject false positives and allow attacked images to still register 
some evidence of the watermark. 

Fig. 2. The method for detection 



Untampered recovery. The watermark will be perfectly recovered if there is 
no tampering to an image. Using $f$ for insertion, we know that only the Cm 
channel has been modified, so the Cs content, and hence the regions, will be 
identical: 

$R(I)_p = R(I')_p, \forall p \in [1, Z]$    (3) 

We now extract $V(p)$ and $V'(p)$, $\forall p \in [1, Z]$, which are the Cm channels of $R(I)_p$ 
and $R(I')_p$. However, $V'(p) = f(V(p))$ for every element in the vectors because 
$f$ is a closed function. Thus the two vectors are equal and hence the distributions 
are identical, giving us a perfect recovery. So comparing D and D', we get a score 
of zero for the Kolmogorov Smirnov test and a score of zero for the difference 
in means, both of which are guaranteed to be less than any positive threshold 
values for $\alpha$ and $\beta$. The reason for this perfect recovery is that as the 
modifications are integral, no quantization effects take place to diminish the strength of 
the natural watermark. 
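
The two comparisons of equations 1 and 2, together with the detection score d, can be sketched as follows. This is an illustration under assumptions rather than the authors' code: the function names are hypothetical, each region is passed in as a plain list of Cm values, and the default thresholds 0.1 and 0.3 are the values chosen later in this section.

```python
from bisect import bisect_right
from statistics import mean


def edf_gap(a, b):
    """Largest absolute gap between the empirical distribution functions of a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a_sorted, x) / len(a_sorted)
            - bisect_right(b_sorted, x) / len(b_sorted))
        for x in set(a_sorted) | set(b_sorted)
    )


def region_passes(expected, observed, alpha=0.1, beta=0.3):
    """A region passes only if both the edf gap and the mean difference are small."""
    return (edf_gap(expected, observed) < alpha
            and abs(mean(expected) - mean(observed)) < beta)


def detection_score(expected_regions, observed_regions, alpha=0.1, beta=0.3):
    """Percentage of the Z region comparisons that pass both tests."""
    passed = sum(
        region_passes(e, o, alpha, beta)
        for e, o in zip(expected_regions, observed_regions)
    )
    return 100.0 * passed / len(expected_regions)


expected = [[10, 20, 30, 40], [5, 5, 6, 7]]
print(detection_score(expected, expected))                                # 100.0
print(detection_score(expected, [[10, 20, 30, 40], [50, 60, 70, 80]]))    # 50.0
```

An untampered image gives identical distributions in every region, so both differences are zero and the score is 100, as argued above.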

False positive analysis. We have two axes by which to measure the detection 
of a watermark inside a region, as written above in equations 1 and 2. In order 
to determine what value to give to $\alpha$ and $\beta$, a watermark, K, was inserted into 
the "Peppers" image, as shown in Figure 3a, which was of size 256 pixels square, 
and 24 bit colour. The resultant image, I', was tested with a thousand false keys 
and scores for each of the axes were obtained. With each attempt to detect, I' 
was subdivided into Z regions as predicted by the incorrect seed. Each of these 
regions was scored in both tests. The results of these tests are shown in Figure 
4 below. 




Fig. 3. The watermarking process applied to the peppers image: (a) I, (b) I' 

Figure 4 indicates the second test is much more discerning than the first, with 
over 75% of the results having a mean difference of over 1.4. 1.1% of comparisons 
resulted in a totally affirmative response, that is, $f(D)$ and D' were found to 
have identical distributions, even though the key was incorrect. This occurs 
because the range of deviation that can be inserted into the discrete domain is 
limited, allowing guesses to occasionally reveal the watermark. The important 
issue is not that it takes place but determining the extent to which it does and 
factoring it into the final confidence level of the watermark. With these results, 
we chose to set $\alpha = 0.1$ and $\beta = 0.3$. If these parameters are set too low, 
then the robustness of the test will be diminished, while if they are set too high, our 
confidence in positive results is eroded. Based on these settings, the empirically 
determined false positive error rate is, on average, 4.2% per region. This means 
that when I' is tested for the presence of a watermark, then we would expect 
that for every region involved, there is a 4.2% chance of a false positive result. 

Fig. 4. The empirical distribution function of regions for 1000 incorrect seeds 

The probability density function for the detection of false seeds is produced 
below in Figure 5, using the parameters for $\alpha$ and $\beta$ as given above. This means 
that if we have a result of 50% in detection, the probability of this happening 
from a false seed is 1/500, as shown in Table 2. 

Table 2. Tabulated scores from Figure 5 

Detection score    P(False positive) 
10                 0.179 
20                 0.048 
30                 0.027 
40                 0.005 
50                 0.002 
60                 0.002 
70                 0.0 
100                0.0 



3 Results 

Our experiments were based on the peppers image as shown above in Figure 
3a. We also used $f$, $\alpha$, $\beta$ as defined above. We chose 10 random values for K; 
each of Z, Cs, and Cm was dependent on the value of K. Our tests used 
the StirMark package to perform the attacks. 

Fig. 5. The empirical distribution function for detection using 1000 incorrect seeds 

An example of a watermarked image is shown in Figure 3b. The average 
PSNR for these 10 images is 42.61 dB. The detection score in each case was 
d = 100, which equates to an empirical confidence probability of 1. This occurs 
exactly as predicted in the above subsection on detection. 



3.1 Cropping 

An attack which removes a single row and column has no impact on the 
detection score. Removing 17 rows and 5 columns gave an average region recovery 
of d = 90.2. There were no false positives at this level, meaning that we have 
full confidence in the result that the watermark is present. When interpolation 
is used, then the results are weakened but it still survives cropping at 5% with a 
score of d = 17.6, which gives p = 0.93. This means that we are 93% certain that 
the watermark is genuine. The purpose was to illustrate that there was no need 
to try and insert extra rows before detection proceeds. Performing pre-processing 
would naturally improve the score. 



3.2 Scaling 

The results of scaling were: 

Table 3. Detection results for scaling the image 

Scaled percentage    d       p 
50%                  11.4    0.82 
75%                  27.0    0.97 
90%                  28.3    0.97 
110%                 40.1    0.995 
150%                 26.4    0.97 
200%                 28.4    0.97 

3.3 Aspect Ratio 

The results of changing the aspect ratio were: 

Table 4. Detection results for changing the aspect ratio 

Scaled percentage    d       p 
50%                  11.4    0.82 
75%                  27.0    0.97 
90%                  28.3    0.97 
110%                 40.1    0.995 
150%                 26.4    0.97 
200%                 28.4    0.97 


Fig. 6. Watermarked image with an aspect ratio of 1.2:1 



3.4 Rotation 

When the image is rotated by 90°, the watermark is totally recovered as would 
be expected. When it is rotated by a small amount, say 10°, d = 20.4, which 
gives p = 0.95. The image size is also reduced to 221 by 221 pixels. This is shown 
in Figure 7. 



3.5 StirMark 

StirMark caused considerable damage to the watermark, leaving an average score 
of only d = 11.8, or a probability of success of just 0.82. 

Fig. 7. Watermarked image with a 10° clockwise rotation 

4 Analysis 

The results do not provide the same level of confidence as many other water- 
marking systems that are available. For example, when we get a score of 0.6 in 
detecting a watermark, this only corresponds to a 99.9% certainty of authenticity, 
which is low compared to other schemes. However, the purpose of this system 
was to eliminate the preprocessing that is required for detection to take place 
and to this end it succeeded. When cropping took place, detection proceeded 
normally and the watermark was shown to exist with full confidence. Rotation 
does obscure the recovery process (except at multiples of 90°). The usage of 
StirMark and other interpolative attacks does cause problems for watermark detection 
and this needs to be improved. 

The reason for the comparatively low score is that we cannot compare samples 
pairwise - before and after the watermarking process. Usually, because samples 
can be compared in this way, we can easily measure the gap between the observed 
and estimated watermark. The great disadvantage of using a technique which 
compares samples pairwise, is that synchronization becomes an issue. If synchro- 
nization is lost, then two samples may not have been intended to correspond to 
one another and the watermark becomes displaced or lost. Our technique suc- 
ceeds without full synchronization because the detection of the watermark is 
based on the properties of distributions, rather than the individual elements 
within it. This avoids the need for preprocessing, although obviously if the dis- 
tribution misses a sufficient number of terms and becomes unrepresentative of 
the original form, then the watermark will not be validated for that region. 

Typically, another means by which synchronization is restored is the insertion 
of a pattern or mesh across the image, which indicates how the image has been 
modified. Recovery of the watermark proceeds after the image has been inverted 
back to its original form. Our technique does not use such preprocessing, and 
it also avoids the weakness that this mesh is usually publicly detectable, allowing 
an attacker to remove it. Our method also avoids the idea of explicitly using reference points, instead 
allowing them to be determined with allowances for error in the regions to be 
recovered. That is, we generate regions within the image that give some concept 
of locality without insisting that they be identically generated each time. 

An attack which would work very well is the image invertibility attack 
suggested by Craver et al. Because we are making modifications in the forward 
direction, it is quite trivial to calculate a watermark and subtract it from the 
image. As suggested by Craver, the set of coefficients we use for modification 
can instead be shifted to the positive domain and the hash of the image be used 
to determine the sign. In this way, the originality of the image is assured. The 
disadvantage is that this technique only uses between 3 and 8 coefficients, 
meaning that for a particular choice of coefficients, the problem of altering the hash 
of an image to choose a pattern of signs is quite trivial. 

5 Conclusion 

There are still many open problems to this technique, such as the optimum 
choices for $f$, the range from which S may be chosen, the range from which Z 
may be chosen, and establishing if choices for $\alpha$ and $\beta$ can be made which are 
universally applicable. The introduction of other tests which add confidence to 
the recovery of a watermark is also an area which can be worked on, as the process 
is quite modular. The method by which information has been hidden is also not 
necessarily robust to further information hiding, such as re-watermarking. Our 
efforts to hide information in the distribution can certainly be improved. 

This paper presents a specific example of a general framework which differs 
from most other techniques. In a general sense, there are three main sections 
presented in this paper. The first is the process of segmentation, of dividing I into 
separate regions. The second part is modifying an individual distribution in such 
a distinctive way that W can be inserted. Finally, we need an effective way of 
detecting how W has modified the distribution and being able to retrieve it 
later. Each of these can be replaced with a more effective technique but the basic 
principle of not needing pre-processing will still hold true. For example, a 
histogram specification technique, as proposed in [3], may be suitable to replace parts 
two and three of this process. Presently, there is also the general disadvantage 
of requiring the original image for detection to compare the distributions. 

The purpose of the technique was always to focus on easily identifying the 
presence of the watermark. This method may be useful for webcrawling, or a 
first estimate of the presence of a watermark. In the event that a suspect image 
is found, it is flagged for closer inspection and preprocessing may be performed 
then. If an attacker performs a trivial operation, such as flipping it, then the 
watermark will be recoverable. It is hoped that this paper is a step towards 
the efforts of automated watermark recovery, where human intervention is not 
necessary. Whilst this scheme does not have the same power as other methods, 
the ability for it to overcome arbitrary geometric attacks is of merit. 

Abstract. The small size of a color palette is utilized to include hidden 
information in a true-color image. Adaptive vector quantization is performed 
to define the color palette, which is also used to extract the hidden 
information. The resulting Voronoi tessellation is approximated by 3D cubes to 
define regions, centered on the prototype vectors, that are used to code hidden 
information. The regions are partitioned into cells, the identity of which encodes 
the hidden information in each pixel. 



1 Introduction 

This paper describes a method for hiding information in a color image. The 
basic idea is that although an image may be true-color, a small color palette defines 
which colors will be displayed on the screen. Therefore the additional bits of the pixel values 
can be used effectively to hide information such that no visual changes can be 
perceived. 

Few results have been reported for color images. Currie et al [1] have 
worked on the loss and corruption caused by JPEG compression in an image. They 
developed a steganographic method based on viewing the 
pixel as a point in space with the three color-channel values as the coordinates. In 
their coding scheme, the length of the vector representing the pixel's position in 
three-dimensional RGB color space is modified to encode information. 

Fridrich [2] presents two approaches to message hiding in palette-based images 
of GIF and PNG formats: the messages can be embedded into the palette or into the 
image data. The method embeds a secret message into palette-based images by 
hiding message bits in the parity bits of close colors. In this way, the problem of 
occasionally making large changes in color is avoided. 

Kutter et al [3] present a digital watermarking method based on amplitude 
modulation. The method embeds single signature bits by modifying pixel values in 
the blue channel. This method has been shown to be resistant to both classical and 
geometrical attacks, and the signature can be extracted without the original image. 

Piva et al [4] have developed a DCT-domain watermarking technique for color 
images, which exploits the characteristics of the Human Visual System and the 
correlation between the RGB image channels. The watermark is hidden in the data 
by modifying a subset of the coefficients of the full-frame DCT of the image, and each color 
channel is watermarked separately. The watermark is robust against the most common 
attacks and image processing. 

Fleet et al [5] describe the embedding of amplitude-modulated sinusoidal signals in 
color images. The signal is embedded into the yellow-blue color band of an 
opponent-color representation scheme, and only the high frequencies are chosen. The 
perceptibility of the embedded signal is controlled using a quantitative model of human 
visual discriminability. The algorithm is robust to printing and scanning. 

Chae et al [6] propose a robust data embedding scheme in YUV color space, which 
uses noise resilient channel codes based on a multidimensional lattice structure in the 
digital wavelet transform domain. The method enables embedding a significant 
amount of signature data (an image) with very little perceptual distortion. The signature 
image is robust to lossy compression. 

In our method the color palette is utilized when information is hidden in 
and extracted from the image. The color palette can be fixed or adaptive. In the first 
case, the palette is determined only once and is used with all images by the 
rendering application. This version is easier to implement but may introduce 
unacceptable quantization effects in the image. In the second case, a color palette is 
constructed for each image. This produces better image quality in terms of 
preserving the original color content. In this paper the use of an adaptive color palette is 
described. 

The size of the color palette defined by an application sets a limit on the number 
of bits that can be hidden in the image. The smaller the palette, the larger the effect on image 
quality. Our data hiding algorithm does not deteriorate the visible image quality 
any further, for a reason that will become clear below. 

In section 2 the algorithm will be described. Experiments are explained in section 
3. Finally in section 4, we summarize the new technique and conclude the paper. 



2 Methods 



2.1. Adaptive Color Palette Construction 

Vector quantization is performed in the 3D RGB space to define clusters representing 
most dense areas in the color space. A fixed number of prototype vectors can be used 
to create Voronoi tessellation. Alternatively, the number can be user-defined. In the 
second option, only those prototypes are accepted, which do not have neighboring 
prototypes too close to them in the Euclidean sense. The selected prototype vectors are 
included in the color palette of the image. Figure 2 presents an example image 
(Surfing), and figure 3 illustrates the color palette of the image by showing prototype 
vectors as colored 3D points in the color space. 

Fig. 2. Original Surfing-image 




Fig. 3. 3D color palette of surfing-image 



2.2. Approximation of Voronoi Tessellation 

Each Voronoi cell is next approximated with a 3D cube centered on the corresponding 
prototype vector, as illustrated two-dimensionally in figure 4. Prototype 
vectors are named as (R, G, B). Each Voronoi cell is divided into smaller 
sub-cells, the identity of which determines the hidden information to be coded in a pixel. 
For example, if a cell is partitioned into 128 sub-cells, 7 bits of additional information 
can be coded in pixels having this color. Because it is assumed that no attacks are 
done on the image, minimally small sub-cells can be defined. The minimum size of a 
sub-cell is computed on the grounds of the resolution of the representation of 
prototypes in the palette. A minimal change in the final decimals gives a dimension 
for the sub-cell. If the same resolution is used for each color component, each sub-cell 
is a symmetric 3D cube. 

In our implementation, each Voronoi cell is approximated with these minimal 
cubes stacked in 3D space and centered in the prototype vector to form a larger cube 
of cubes, a symmetric super-cube. The super-cube for each Voronoi cell is adapted to 
fit its size such that it is the largest cube, which does not touch the closest cell surface. 
This is demonstrated two-dimensionally in figure 4. In addition, each dimension of 
the cube is $2^k$ where k is an integer to be defined adaptively. A computationally faster 
implementation is achieved, if the same super-cube size is used for all cells. The size 
would be defined by the cell associated with that prototype vector which has the 
nearest neighboring prototype vector. 

It would be computationally expensive to calculate the decision surfaces explicitly 
in order to determine the maximum size for a super-cube. Therefore, parameter k is 
defined by increasing gradually a super-cube size until one of the corners touches or 
intersects a decision surface. This event can easily be detected by testing whether the 
3D coordinates of a corner are nearer to the prototype vector of the current Voronoi 
cell than any other cell. The search converges quickly because only cubes of size $2^k$ 
are considered. 
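
The corner test described above can be sketched as a short search over k. This is a minimal illustration under assumptions, not the authors' implementation: the names `max_cube_exponent`, `palette` and `k_max` are hypothetical, the cube is centered on the prototype with side length 2**k, and a corner lying exactly on a decision surface is treated as touching it. Since a Voronoi cell is convex, a cube whose eight corners lie inside the cell lies inside it entirely.

```python
from itertools import product


def sq_dist(p, q):
    """Squared Euclidean distance between two 3D points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def corners(center, side):
    """The eight corners of a cube of the given side length centered on `center`."""
    half = side / 2
    return [
        tuple(c + s * half for c, s in zip(center, signs))
        for signs in product((-1, 1), repeat=3)
    ]


def max_cube_exponent(index, palette, k_max=8):
    """Largest k for which every corner of the cube of side 2**k centered on
    palette[index] is strictly nearer to that prototype than to any other.
    Returns None if even the cube of side 1 fails the test."""
    center = palette[index]
    others = [p for i, p in enumerate(palette) if i != index]
    best = None
    for k in range(k_max + 1):
        inside = all(
            sq_dist(c, center) < sq_dist(c, o)
            for c in corners(center, 2 ** k)
            for o in others
        )
        if not inside:
            break
        best = k
    return best


# Two hypothetical prototypes 10 units apart along the R axis.
palette = [(0, 0, 0), (10, 0, 0)]
print(max_cube_exponent(0, palette))  # 3
```

With the prototypes 10 units apart, a cube of side 8 still fits (its corners are 4 units from the center along R), while a cube of side 16 crosses the decision surface at R = 5.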







Fig. 4. Example of 2-D Voronoi tessellation and tessellation approximation 



2.3. Hiding Information in Pixels 

In the vector quantization process Voronoi tessellation defines the colors that map to 
the same colors on the display. Our idea is to offset the color value of a pixel within 
the corresponding Voronoi cell in a controlled manner in order to include hidden 
information. For example, if a 24-bit RGB color representation and 256-sized palette 
are used, on average twenty bits are at our disposal to code additional information in 
each pixel. The super-cubic approximation will lose many of the bits that could be used to code 
hidden information because a large portion of a Voronoi cell is unused. However, it 
results in a simple implementation. 

Hidden information will be included in each pixel of the image while scanning 
through the image in a predetermined order. In our implementation, pixels are 
scanned from left to right and from top to bottom. Figure 5 presents the information 
embedding scheme. The information to be hidden in the image is transformed into a 
bit stream. The bit stream is divided into variable sized segments, each of which is 
included in a separate pixel. The size of a segment depends on the size of the 
super-cube associated with the prototype vector of the pixel in question. The size of a 
segment is $3k_i$ for a super-cube $i$. The segment is then divided into three 
sub-segments of size $k_i$. 

These three sub-segments are interpreted as 3D coordinates within the super-cube 
and thus identify the sub-cell or minimal cube in question. The new color value of the 
pixel is defined by the coordinates of this sub-cell in the color coordinates system. In 
practice, the new color value is computed by vector addition of the prototype vector 
and the super-cube internal offset vector. 

Now we consider an example of the embedding process. After calculating the 
super-cube, the value of the corner point $(R_c, G_c, B_c)$ and $k$ are stored. This is done for 
all vectors in the color palette. Calculation of the super-cube is illustrated in figure 6. 
After that, the pixels of the original image are processed as follows. 

The nearest point $(R_{min}, G_{min}, B_{min})$ to the pixel $(R_i, G_i, B_i)$ in the color palette is 
calculated using the Euclidean distance metric. This is illustrated in figure 7. A 
segment of length $l = 3k$ is taken from the bit stream to be hidden. The segment is 
divided into three sub-sections a, b and c of equal lengths. The new value of the pixel 
$(R_i, G_i, B_i)$ is calculated using the corner point $(R_c, G_c, B_c)$. Thus, the deviation from the 
corner point encodes the segment of bits. 

Considering an example, which is illustrated in figure 8, we will assume that k = 4 
and $l = 3 \cdot 4 = 12$. If the block of the bit sequence is L = {101100010101}, we have 
sub-blocks a = {1011}, b = {0001} and c = {0101}. The binary sequences a, b and c 
can be expressed in decimal form as a' = 11, b' = 1 and c' = 5. Finally we get the 
new value of the pixel $(R_i, G_i, B_i)$ as 

$\begin{pmatrix} R_i \\ G_i \\ B_i \end{pmatrix} = \begin{pmatrix} R_c \\ G_c \\ B_c \end{pmatrix} + \begin{pmatrix} a' \\ b' \\ c' \end{pmatrix}$. 

This scheme implements a variable-bit rate method of coding hidden information 
in a color image. The rate depends on the color value in each pixel. The larger the 
Voronoi cells, the more bits can be hidden. Therefore, images containing very 
different colors enable the largest inclusion of information. 
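
The worked example with k = 4 can be reproduced with a short sketch. It is illustrative only: the function name `embed_segment` and the corner coordinates (100, 60, 200) are hypothetical, and the bits are handled as a text string for readability.

```python
def embed_segment(corner, bits, k):
    """Split a 3k-bit string into three k-bit fields and add them to the corner point."""
    if len(bits) != 3 * k:
        raise ValueError("segment must contain exactly 3*k bits")
    offsets = [int(bits[i * k:(i + 1) * k], 2) for i in range(3)]
    new_pixel = tuple(c + o for c, o in zip(corner, offsets))
    return new_pixel, offsets


# The bit block from the example above, with an assumed corner point.
new_pixel, offsets = embed_segment((100, 60, 200), "101100010101", 4)
print(offsets)    # [11, 1, 5]
print(new_pixel)  # (111, 61, 205)
```

Each offset is at most 2**k - 1, so the new value stays inside the super-cube as long as the corner point is the cube's lowest corner.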

The maximum amount of hidden information can be computed easily after vector 
quantization of input image. The number of bits to be stored in a pixel depends 
directly on the size of the super-cube associated with it. A histogram can be computed 
of the hits to the prototype vectors. The total number of hidden bits is then computed 
by multiplying the histogram elements by the corresponding super-cube size and 
finally summing up the products. 
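
The capacity calculation can be sketched as a weighted sum. The sketch assumes that the "super-cube size" contributes 3k bits per pixel, as in the segment length given earlier; the function name and the histogram values are hypothetical.

```python
def capacity_bits(hits, exponents):
    """Total hidden bits: pixels per prototype times 3*k bits for that prototype."""
    return sum(h * 3 * k for h, k in zip(hits, exponents))


# Hypothetical histogram of pixels per prototype and the matching exponents k.
hits = [1000, 500, 250]
exponents = [4, 3, 2]
total = capacity_bits(hits, exponents)
print(total)              # 18000
print(total / sum(hits))  # average bits per pixel
```

Dividing the total by the number of pixels gives a bits-per-pixel figure comparable to the Bpp columns reported in the experiments.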




Fig. 5. Information embedding in the color image 

Fig. 6. Calculation of the super-cube 

Fig. 7. Finding the nearest point in the vector quantized color palette 

Fig. 8. Embedding the hidden information 

2.4. Extraction of Hidden Information 

In order to extract the hidden information, the offset values within each super-cube 
must be determined for each pixel. To facilitate this, super-cubes are determined from 
the color palette defining the Voronoi tessellation. Exactly the same algorithm as in 
embedding process is used to determine the super-cubes centered in the prototype 
vectors. The principle of addressing the super-cube sub-cells is naturally known to the 
user. The extraction scheme is illustrated in figure 9. 

The received true-color image is scanned in the same order as before and each 
pixel is considered in turn. A full search is performed on the color palette to locate 
that prototype vector which has the best match to the pixel value. The 3D coordinates 
of the sub-cell are then computed from the vector difference of true-color coordinates 
and prototype vector of the pixel value. The coordinates are transformed to integer 
values by quantizing them by the minimal sub-cell size defined by the resolution of 
the palette representation. The resulting coordinates are then concatenated to a bit 
stream segment and further concatenated to the overall bit stream of the hidden 
information. 
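
The extraction steps can be sketched as follows. Several simplifications are assumed and are not taken from the paper: the minimal sub-cell size is 1 (integer color values), every exponent k is at least 1, and the offsets are measured from the lowest corner of the super-cube, taken here as the prototype minus 2**(k-1) in each component. All names are hypothetical.

```python
def nearest(pixel, palette):
    """Index of the palette prototype closest to the pixel (full search)."""
    return min(
        range(len(palette)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(pixel, palette[i])),
    )


def extract_segment(pixel, palette, exponents):
    """Recover the 3k hidden bits of one pixel as a text string."""
    i = nearest(pixel, palette)
    k = exponents[i]
    # Assumed convention: lowest corner of the cube centered on the prototype.
    corner = [c - 2 ** (k - 1) for c in palette[i]]
    return "".join(format(p - c, f"0{k}b") for p, c in zip(pixel, corner))


palette = [(100, 100, 100), (200, 200, 200)]
exponents = [4, 4]
# Corner of the first cube is (92, 92, 92); offsets (11, 1, 5) give this pixel.
print(extract_segment((103, 93, 97), palette, exponents))  # 101100010101
```

Extraction only works if the modified pixel still maps to the same prototype, which is what keeping the super-cube inside the Voronoi cell is meant to ensure.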




Fig. 9. Information extraction from the color image 



3 Experiments 



The algorithm was tested with 16 color images (about 256x170 pixels in size). The 
images were divided into four groups of four images based on their characteristics. 
The images in the first group have only a few colors in the background and only a 
few details. The second group contains close-up images of human faces. The third 
group comprises scene images, while the images in the fourth group have lots of bright 
colors. All the test images are illustrated in Figure 10. Figure 11 shows a modified 
image from group four when the size of the color palette is 256. The average amount 
of hidden data is 8.1 bpp. Figure 12 shows the sizes of the super-cubes of each pixel 
as a gray-level image. A dark color signifies a large cube. In this case, the maximum 
side length of the cube is 16 bins. 



Fig. 10. Test images (Group 1, Group 2, Group 3, Group 4) 

Fig. 11. Modified Surfing-image 

Fig. 12. Sizes of the super-cubes for modified Surfing-image 

Table 1 presents the results when the size of the color palette is 256 colors.
It can be seen that if the color palette includes many different colors, it is
possible to hide more information in the image. Naturally, the super-cubes
are bigger if the points in the color palette are far from each
other. The maximum amount of hidden data was 9.3 bpp, while the minimum
was 4.3 bpp.
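The bpp figures appear to be the number of hidden bits divided by the number of pixels in the image; the Blue image, for example, gives 188655 / (256 x 171), or roughly 4.3. A minimal sketch of that arithmetic follows; the function name is ours and is not taken from the paper.

```c
/* Bits per pixel: hidden payload size divided by the pixel count.
   The name bits_per_pixel is illustrative, not from the paper. */
double bits_per_pixel(long hidden_bits, int width, int height)
{
    return (double)hidden_bits / ((double)width * (double)height);
}
```

Applied to the Ornament row (405783 bits, 256x171 pixels), the same computation gives a value that rounds to the reported 9.3 bpp.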

Table 1. Experimental results, when the size of the color palette is 256

Group  Name          Size     Hidden data (bits)  Bpp   Average (bpp)
1      Blue          256x171  188655              4.3   5.2
       Sky           171x256  260172              5.9
       Biljard       170x256  214617              4.9
       Pen           168x256  243528              5.7
2      Readpaper     256x171  218184              5.0   6.1
       Boxing        256x171  256662              5.9
       Manwithphone  256x171  282102              6.4
       Diana         256x168  295461              6.9
3      Mountains     171x256  321093              7.3   6.5
       City          171x256  291174              6.7
       Cliffs        170x256  281676              6.5
       Sand          171x256  242310              5.5
4      Surfing       171x256  368250              8.1   8.6
       Ornament      256x171  405783              9.3
       Ornament2     256x171  381108              8.7
       People        171x256  362145              8.3

Table 2 presents the results when the size of the color palette is 128 colors.
Compared with the preceding case, the vectors in the color palette are farther away
from each other. Thus it is possible to hide more information. In this case, the
maximum amount of information is 10.7 bpp, while the minimum is 4.8 bpp.
Decreasing the palette size from 256 to 128 entries yields an increase of approximately
1.5 bpp on average.

Table 2. Experimental results, when the size of the color palette is 128

Group  Name          Size     Hidden data (bits)  Bpp   Average (bpp)
1      Blue          256x171  208431              4.8   5.5
       Sky           171x256  263040              6.0
       Biljard       170x256  237729              5.5
       Pen           168x256  249120              5.8
2      Readpaper     256x171  270267              6.2   7.4
       Boxing        256x171  322755              7.4
       Manwithphone  256x171  335235              7.7
       Diana         256x168  354213              8.2
3      Mountains     171x256  374277              8.5   7.6
       City          171x256  349251              8.0
       Cliffs        170x256  319134              7.3
       Sand          171x256  288882              6.6
4      Surfing       171x256  419850              9.6   9.7
       Ornament      256x171  467331              10.7
       Ornament2     256x171  412242              9.4
       People        171x256  398877              9.1



Finally, Figure 13 illustrates the number of hidden bits per pixel for four color
images (Blue, Diana, Cliffs and Surfing) with different sizes of the color palette (32,
64, 128, 256 and 512). One image was tested from each group.



4 Conclusions 

In this paper, we introduced a new steganographic technique for embedding secret
messages in true-color images. The method is based on the fact that not all colors of
a true-color image are displayed if an application uses a small color palette. A
vector quantization is performed in order to reduce the number of colors of an image
and create a color palette. The extra bits are then utilized to encode additional
information in each pixel by offsetting each color value within the corresponding
approximated Voronoi cell in 3D color space. The hidden message can be decoded by
computing the offset values from the modified color values and palette. The
extraction can be done without the original image.
Fig. 13. The amount of hidden bits per pixel when the size of the color palette is increased (vertical axis: bpp; horizontal axis: color palette size 32, 64, 128, 256, 512)

In the experiments, 5-10 message bits were embedded in each 24-bit pixel. The
number of embedded bits depends on the 3D distribution and the number of colors
in the color palette.

Our technique is useful in applications where robustness of the hidden data is
not required. Attacks like filtering, compression or requantization destroy the hidden
information. Future research should include the enhancement of the robustness properties
of the algorithm. In order to enhance security, the secret message should first be
encrypted so that outside parties cannot read the information even though they
know how to extract it from the image. Public-key encryption methods, for example,
can be applied to this end.

There are applications where a given message must be embedded in a given image. 
In this case a small palette must be used in order to have large Voronoi cells. The 
encoding algorithm should then test each allowed palette size in a decreasing order 
until the message can be hidden. It is clear that the quality of the modified image gets 
worse as the size of the color palette decreases, but the amount of hidden bits per 
pixel increases. Thus, when the number of colors is too small, it may be easy to spot 
the difference between the original and the modified image. In consequence, the 
processed image should be previewed before accepting the result. 
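The palette-size search described above can be sketched as follows. This is only an illustration under stated assumptions: the callback type capacity_fn, the function names, and the small capacity table are ours. The table reuses the hidden-data sizes reported for the Surfing image in Tables 1 and 2; capacities for other palette sizes are not given in the text and are treated as zero here.

```c
#include <stddef.h>

/* Hypothetical callback: how many message bits fit into the image when a
   palette with `palette_size` entries is used. */
typedef long (*capacity_fn)(int palette_size);

/* Try the allowed palette sizes in decreasing order and return the first
   (largest) one whose capacity covers the message, or -1 if none does. */
int choose_palette_size(const int *sizes_desc, size_t count,
                        long message_bits, capacity_fn capacity)
{
    size_t i;
    for (i = 0; i < count; i++) {
        if (capacity(sizes_desc[i]) >= message_bits)
            return sizes_desc[i];
    }
    return -1;
}

/* Illustrative capacities: the Surfing rows of Tables 1 and 2. */
long surfing_capacity(int palette_size)
{
    switch (palette_size) {
    case 256: return 368250L;
    case 128: return 419850L;
    default:  return 0L;   /* sizes not reported in the tables */
    }
}

/* Demo helper: pick a palette size for a message of `message_bits` bits. */
int demo_choose(long message_bits)
{
    static const int sizes[] = { 256, 128 };
    return choose_palette_size(sizes, 2, message_bits, surfing_capacity);
}
```

Starting from the largest palette keeps the image quality as high as the message length allows, which matches the trade-off stated in the text; the visual preview step remains a manual check.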
Abstract. Tamper-resistant software has been studied as a technique to
protect algorithms or secret data. There are many ways to realize
tamper-resistant software, including making software hard to read.
So far, no objective and quantitative method is known for evaluating
tamper-resistant software. Most known evaluation methods require
human involvement, which means that their results depend heavily
on the skill and subjectivity of the evaluator. An objective and
quantitative evaluation method is therefore needed in place of
subjective ones. In this paper we propose such a method to measure
how hard software is to read. The basic idea is to use the
parse tree of a compiler for a programming language, and to evaluate the depth
and weights of the tree for a code. We give some experimental results to
examine its effectiveness.



1 Introduction 

Although software is often distributed in a binary form, it is sometimes
distributed in source code. Typical examples are free application software for UNIX,
Java applets and codes written in script languages like Perl and JavaScript.
Meanwhile, software distributed in a binary form may even be transformed into
source code by reverse engineering. Since source code is often written in an
easily readable form, these distributed source codes may be analyzed so that their
algorithms or the secret data used in them are extracted by their users. In terms of
intellectual property rights, it is desired that the owner of software can protect
algorithms or secret data even in the case of source code. In this respect,
creating tamper-resistant software is an important research topic [Auc96,MTT97,
MMO98]. Tamper-resistance is a property such that a secret object hidden inside
can hardly be observed or modified from the outside. Software with such an attribute is
called tamper-resistant software. By making software tamper-resistant one can
ensure that the software operates correctly as originally designed. This means
that one can force software users to obey the designated process of the software.
There is high demand for such software in electronic commerce systems and
agent systems.

Making software hard to read by scrambling codes is regarded as one of the
approaches to creating tamper-resistant software. In this approach the description
of software is converted into another one which analysts cannot easily read. If
the analysts fail to understand the algorithm of the software, they cannot properly
modify the software. So far, several methods have been proposed for making
software hard to read. For example, several basic operations such as dummy
code insertion, code replacement and code shuffling are proposed for the
assembly language in [MMO98]. Modifying the structure of a loop into a complicated
form is proposed for the C language in [MTT97]. Separating source code into
modules is also proposed for the C language [TOM97].

It is essential to evaluate the difficulty of reading software created by such
scrambling techniques. In general, we can construct distinct algorithms which
output the same result but perform in a different way. A good example is the
sorting algorithm. There are many sorting algorithms such as the bubble sort,
the quick sort, the heap sort and so on. These distinct algorithms may have
different degrees of structural complexity, since each algorithm has its own intrinsic
difficulty of reading. Hence, we focus on obfuscating the same algorithm rather
than different algorithms, and compare the difficulties that originate from different
representations of the same algorithm.

With respect to the evaluation of the difficulty of reading tamper-resistant
software, the following methods are known. For the high-level language a subject is
requested to read the source code of tamper-resistant software and the time for reading
it is measured [MTT97]. For the assembly language the distribution of opcodes
is observed [MMO98]. The difficulty of reading software deeply depends on the
skill and subjectivity of analysts. In this sense, we should create a new evaluation
method which is not affected by the skill and subjectivity of analysts. The known
evaluation method for the assembly language shown in [MMO98] is considered to
be objective and quantitative. However, no objective and quantitative evaluation
method is known for high-level languages. Therefore, we propose an objective
and quantitative evaluation method and discuss its validity. Note that we do not
show how large the evaluated quantity should be. In order to relate the evaluated
quantity to a human's ability to read software, a large number of experiments is
required. This type of experimental evaluation is out of the scope of this paper.



2 Definitions 

Fig. 1 shows the process of creating tamper-resistant software, where f denotes
an algorithm or a program for tamper-resistant software, c(f) denotes a source
code of f, TRC(f) denotes a source code of tamper-resistant software of f,
and T denotes an algorithm to generate tamper-resistant software. TRC is an
acronym of Tamper-Resistant Code. As observed in Fig. 1, the algorithm T can
be considered as a transducer or a filter to convert a source code c(f) into another
source code TRC(f).

    c(f) --> T --> TRC(f)

Fig. 1. Process of creating tamper-resistant software

We now define (t, m, s)-tamper-resistant software as follows.

Definition 1 Given parameters (t, m, s), (t, m, s)-tamper-resistant software
satisfies the following conditions.

— Let P_c and P_TRC be executable programs of c(f) and TRC(f), respectively.
Let t_c and t_TRC be the computational times of P_c and P_TRC, respectively,
m_c and m_TRC be the memory sizes of P_c and P_TRC, respectively, and s_c and
s_TRC be the program sizes of c(f) and TRC(f), respectively. For given parameters
(t, m, s), the values t_c, t_TRC, m_c, m_TRC, s_c and s_TRC satisfy

\[ \frac{t_{TRC}}{t_c} < t, \quad \frac{m_{TRC}}{m_c} < m, \quad \text{and} \quad \frac{s_{TRC}}{s_c} < s. \tag{1} \]

— P_c and P_TRC output the same value for the same input. In other words, P_c
and P_TRC are software performing in the same way.

When memory is omitted, we simply write (t, s)-tamper-resistant software.

Each (t, m, s)-tamper-resistant software has its own difficulty of reading.
Among tamper-resistant software possessing the same difficulty, software with
smaller (t, m, s) is judged as better software in regard to efficiency. In this
context, very good tamper-resistant software achieves t ≈ 1, m ≈ 1, and s ≈ 1.
In many cases the computational time is crucial for software, and the slowdown
caused by conversion is not desirable. Under these circumstances we should use
tamper-resistant software satisfying the condition t ≈ 1.
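The ratio conditions of Eq. (1) in Definition 1 can be checked mechanically once time, memory and size have been measured. The sketch below is illustrative only: the struct and function names are ours, and it covers only the ratio conditions. The second condition of Definition 1 (identical output for identical input) has to be established separately.

```c
#include <stdbool.h>

/* Measured cost of a program: run time, memory size and program size.
   The struct and its field names are illustrative assumptions. */
struct cost {
    double time;
    double memory;
    double size;
};

/* Returns true when the three ratios of Eq. (1) stay below t, m and s.
   Functional equivalence of the two programs is NOT checked here. */
bool satisfies_eq1(struct cost c, struct cost trc,
                   double t, double m, double s)
{
    return trc.time / c.time < t
        && trc.memory / c.memory < m
        && trc.size / c.size < s;
}
```

For instance, a converted program that runs 1.02 times longer and is 2.23 times larger than the original passes the check for bounds t = 1.1 and s = 2.3, but fails it for s = 2.0.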



3 Proposed Evaluation Method 

3.1 Basic Idea 

A computer uses a compiler to translate a high-level language such as FORTRAN,
PASCAL or C into machine language, which the computer can directly execute. A
compiler is a program that reads a program written in one language, called the source
language, and translates it into an equivalent program in another language, called the
target language. Conceptually, a compiler operates in the following phases, one
by one: lexical analysis, syntax analysis, semantic analysis, intermediate code
generation, code optimization and code generation.

Among these phases we focus on the operation of syntax analysis. In the 
syntax-analysis phase a parser obtains a string of tokens and verifies that the 
string can be generated by the context-free grammar for its source language. 
This operation corresponds to the verification of the grammar of a source code 
in programming language. By the syntax analysis we can draw a tree called a 
parse tree as illustrated in Fig. 2. The root of a parse tree is labeled by the start
symbol, each leaf is labeled by a terminal symbol, and each interior node is
labeled by a nonterminal.

    for (i=0;i<N;i++){a=a+1;}

    statement
      iteration_statement
        FOR
        (
        expr_opt
          EXPRESSION  "i=0"
        ;
        expr_opt
          EXPRESSION  "i<N"
        ;
        expr_opt
          EXPRESSION  "i++"
        )
        statement
          compound_statement
            {
            statement_list
              statement
                expression_statement
                  EXPRESSION  "a=a+1"
                  ;
            }

    (In the original figure, bold type marks tokens.)

Fig. 2. Parse tree of C program containing loops

We deal with an expression as a terminal symbol. In Fig. 2, “i=0”, “i<N”, “i++”
and “a=a+1” are all written as EXPRESSION, representing a terminal symbol.
In general compiler theory, expressions are not dealt with as terminal
symbols and are parsed further into components. However, we put stress on the
syntactical complexity of the source language, such as the combination of an “if”
statement and a “for” statement, and we estimate that the syntax of expressions
does not contribute to this type of evaluation. Thus, we do not conduct syntax
analysis for expressions and deal with expressions as terminals.

The translation of a compiler is regarded as a sequential operation of rea- 
ding, analyzing and understanding a source language. Especially, the compiler 
analyzes and understands a source language syntactically in the syntax-analysis 
phase. Such an operation is exactly what a human performs in case of reading 
source code. Therefore, we use a parse tree for evaluating the difficulty of reading 
tamper-resistant software. 

The depth and the structure of a parse tree become larger and more com- 
plex, respectively, as the syntax of a program becomes more complex. We give 
examples of the following two C programs containing loops. 



One loop.

    for (i=0; i<N; i++) { a = a + 1; }

Two loops.

    for (i=0; i<N; i++) {
        for (j=0; j<N; j++) {
            a = a + 1;
        }
    }

Their parse trees are shown in Fig. 3, where terminal symbols and tokens are
omitted. As shown in Fig. 3, the more complex the syntax of a program becomes,
the more complex and deeper the parse tree becomes. We use this property of parse
trees for the evaluation.




Fig. 3. Structure of parse trees 



3.2 Evaluation Procedure 

Rule for evaluation. In order to concentrate on the most basic procedure of
an objective and quantitative evaluation, we evaluate the complexity originating
from a single parse tree, even for software represented by multiple parse trees.
The complexity originating from multiple parse trees will be dealt with on another
occasion.

We set the following rules for the evaluation.

Rule 1: Weigh edges of a parse tree by the following sub-rules.

Rule 1.1: Set an initial weight on all edges of the parse trees both for the original
source code c(f) and the tamper-resistant code TRC(f).

Rule 1.2: Only for the tamper-resistant code TRC(f), change the weight of edges
of its parse tree depending on the algorithm used for generating TRC(f).

Rule 2: Output the maximum weight among all sums of weight from the root
to each leaf of a parse tree.

We call the output of Rule 2 the points of its source code.

An appropriate evaluation criterion properly evaluates the change of the
structure of a parse tree originating from the difficulty of reading software. In
place of Rule 2 described above, we might use two different rules: (I)
the number of nodes or (II) the width of a parse tree. However, these values are
not appropriate for the following reasons. (I) Nodes represent terminals and
nonterminals. That means the number of nodes increases as the size of a source
code increases, and one gets large evaluation points for some kinds of
tamper-resistant coding which simply increase the size of the source code. (II) Usually
the width of a parse tree becomes larger as the size of a source code increases. As
in case (I), one gets large evaluation points for some kinds of tamper-resistant
coding which simply increase the size of the source code. The methods (I) and
(II) will fail to express the syntactical complexity of a source code. In contrast,
the proposed method uses the depth of a parse tree as defined in Rule 2.

The sum of weights from the root to a leaf of a parse tree represents the 
depth of the leaf in the tree and we regard this value as the degree of difficulty 
of reading the corresponding part of software. For the evaluation of the whole 
parse tree, we just adopt the rule 2 rather than other method such as taking an 
average of all sums of weight from the root to each leaf of a parse tree. We can 
express the complexity of the most difficult part of the software by the maximum 
weight of the sums counted in the rule 2. 
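Rule 2 amounts to finding the heaviest root-to-leaf path in an edge-weighted tree. The following sketch shows one way to compute it; the node layout (first-child/next-sibling links) and all names are our assumptions, and the sample tree is only shaped to reproduce the sum 1 + 1 + 1 + 2 + 3 = 8 used in the discussion of Fig. 4, not copied from that figure.

```c
#include <stddef.h>

/* Parse-tree node in first-child / next-sibling form (illustrative layout). */
struct node {
    int edge_weight;             /* weight of the edge from the parent; 0 for the root */
    const struct node *child;    /* first child, NULL for a leaf */
    const struct node *sibling;  /* next sibling, NULL if none */
};

/* Rule 2: maximum over all leaves of the summed edge weights from the root. */
int points(const struct node *n)
{
    int best = 0;
    const struct node *c;
    for (c = n->child; c != NULL; c = c->sibling) {
        int p = points(c);
        if (p > best)
            best = p;
    }
    return n->edge_weight + best;
}

/* Sample tree: one path with weights 1, 1, 1, 2, 3 and a short side branch. */
int example_points(void)
{
    struct node leaf_a = { 3, NULL, NULL };
    struct node d      = { 2, &leaf_a, NULL };
    struct node leaf_b = { 1, NULL, NULL };
    struct node c      = { 1, &d, &leaf_b };
    struct node b      = { 1, &c, NULL };
    struct node a      = { 1, &b, NULL };
    struct node root   = { 0, &a, NULL };
    return points(&root);
}
```

Taking the maximum rather than the average means a single deeply nested construct determines the score, which is the design choice the paragraph above argues for.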



Fig. 4. Evaluation rule (left: parse tree of the original code; right: parse tree of the tamper-resistant code; filled circles mark terminals)



The rule mentioned above is exemplified in Fig. 4. For the parse tree on the left
side, all edges have weight one. A parse tree of a tamper-resistant code generated by
an algorithm T is shown on the right side. In this example the weights of newly
created edges are increased to 2 and 3 in accordance with Rule 1.2. According
to Rule 2, the points of the tamper-resistant software are determined by the
maximum sum of weights: 1 + 1 + 1 + 2 + 3 = 8.

Assessing grades. We explain how to assess grades of an algorithm for generating
tamper-resistant software. In the following, without loss of generality, we
explain our evaluation method for the case where two generating algorithms are
compared. Suppose one original source code c(f) is converted into two source codes
of tamper-resistant software {TRC_a(f), TRC_b(f)} by two different generating
algorithms {T_a, T_b}. As shown in Fig. 5, ANALYZER obtains a source code and
conducts a syntax analysis. As explained before, a parse tree is generated and
the maximum sum of weights of the parse tree is output. It is the points of the
source code of tamper-resistant software. In order to change weights based on
Rule 1.2, ANALYZER receives knowledge of the generating algorithm, i.e. what kind
of algorithm is applied to which part of the source code. Knowledge(TRC_a(f))
and Knowledge(TRC_b(f)) in Fig. 5 correspond to such information.

For the evaluation, at first, the points x_o(c(f)) of an original source code c(f)
are calculated by ANALYZER. Then the points {x_a(c(f)), x_b(c(f))} of the
tamper-resistant codes {TRC_a(f), TRC_b(f)} are calculated by ANALYZER. If we want
to estimate the credibility of the absolute values of {x_a(c(f)), x_b(c(f))}, we need
to conduct experiments involving human beings. Nonetheless, we can use the
relative values {x_a(c(f)) - x_o(c(f)), x_b(c(f)) - x_o(c(f))} for assessing the difficulty of
each tamper-resistant code. We call such a value the grade of a tamper-resistant
software.



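Fig. 5. Assessing grades (c(f) is passed to ANALYZER to obtain x_o(c(f)); c(f) is also converted by T_a and T_b into TRC_a(f) and TRC_b(f), which ANALYZER scores as x_a(c(f)) and x_b(c(f)) with the help of Knowledge(TRC_a(f)) and Knowledge(TRC_b(f)); the grades are x_a(c(f)) - x_o(c(f)) and x_b(c(f)) - x_o(c(f)))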



If the grades {x_a(c(f)) - x_o(c(f)), x_b(c(f)) - x_o(c(f))} satisfy x_a(c(f)) -
x_o(c(f)) > x_b(c(f)) - x_o(c(f)), TRC_a(f) is more complex than TRC_b(f) with
respect to c(f). In other words, TRC_a(f) is more difficult to read than TRC_b(f)
with respect to c(f).

As easily inferred, the fact that TRC_a(f) is more complex than TRC_b(f)
indicates that T_a is a better algorithm than T_b in terms of generating complex
tamper-resistant software. Thus the grades proposed above can be regarded as
the grades of algorithms to generate tamper-resistant software.
Until now, we have explained our idea only for the case of one original source
code c(f). We can extend our approach to cover multiple original source codes.
For n distinct source codes {c(f_1), c(f_2), c(f_3), ..., c(f_n)}, we first apply our
approach to each source code. The obtained grades are then processed by some
statistical method such as the arithmetic mean. The result of the process is the grade of
an algorithm to generate tamper-resistant software. Suppose we apply two generating
algorithms {T_a, T_b} to {c(f_1), c(f_2), c(f_3), ..., c(f_n)}, respectively, and obtain
{{TRC_a(f_1), TRC_b(f_1)}, {TRC_a(f_2), TRC_b(f_2)}, {TRC_a(f_3), TRC_b(f_3)},
..., {TRC_a(f_n), TRC_b(f_n)}}. For the evaluation, we use ANALYZER for getting

{{x_a(c(f_1)), x_b(c(f_1))}, {x_a(c(f_2)), x_b(c(f_2))},
{x_a(c(f_3)), x_b(c(f_3))}, ..., {x_a(c(f_n)), x_b(c(f_n))}}.

Meanwhile, the points of {c(f_1), c(f_2), c(f_3), ..., c(f_n)} are determined by
ANALYZER as {x_o(c(f_1)), x_o(c(f_2)), x_o(c(f_3)), ..., x_o(c(f_n))}. From these
points, we can compute

{{x_a(c(f_1)) - x_o(c(f_1)), x_b(c(f_1)) - x_o(c(f_1))},
{x_a(c(f_2)) - x_o(c(f_2)), x_b(c(f_2)) - x_o(c(f_2))},
{x_a(c(f_3)) - x_o(c(f_3)), x_b(c(f_3)) - x_o(c(f_3))},
..., {x_a(c(f_n)) - x_o(c(f_n)), x_b(c(f_n)) - x_o(c(f_n))}}.

If we adopt the arithmetic mean, we finally obtain

\[ \mathrm{Average}_a = \frac{1}{n}\sum_{i=1}^{n}\{x_a(c(f_i)) - x_o(c(f_i))\}, \quad \mathrm{Average}_b = \frac{1}{n}\sum_{i=1}^{n}\{x_b(c(f_i)) - x_o(c(f_i))\}. \]
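The averaging step can be written in a few lines. In the sketch below the function name is ours; the sample numbers in the usage note are the points that Sect. 4.2 reports for applying loop modification twice (71, 82 and 63 for programs whose original points are 18, 32 and 25).

```c
#include <stddef.h>

/* Arithmetic mean of the grades x(c(f_i)) - x_o(c(f_i)) over n source codes.
   trc_points[i] holds the points of the tamper-resistant code, orig_points[i]
   the points of the original code. Returns 0.0 for an empty input. */
double average_grade(const int *trc_points, const int *orig_points, size_t n)
{
    double sum = 0.0;
    size_t i;
    for (i = 0; i < n; i++)
        sum += (double)(trc_points[i] - orig_points[i]);
    return n > 0 ? sum / (double)n : 0.0;
}
```

With the sample values, the grades are 53, 50 and 38, and their mean is 47.0, which agrees with the corresponding entry of Table 4.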

4 Experiment 

Based on the proposed evaluation method we evaluate algorithms to generate 
tamper-resistant software. In our experiment source codes are written in C lan- 
guage. But we do not need to restrict ourselves to C. The proposed method can 
be applied to any high-level language. Instead of the entire software, the tamper- 
resistant coding can be applied only to some parts of software. The minimum 
unit is a function of C language. That means the minimum unit of input c(/) to 
conversion algorithms is a function. In the following experiment we take small 
programs as an input c(/). 

4.1 Generating Algorithms 

The following three algorithms are adopted in our experiment: 

— Dummy code insertion, 

— Replacement of function, 

— Modification of loop. 
Dummy code insertion. Figure 6 shows the operation of dummy code insertion.
A part of the original C source code is randomly selected, and the selected code is
inserted at a randomly selected place of the original source code after adding
commands for skipping the inserted part, such as “while (0){ ... }”. Although
a very simple condition is used in this example, we can use a more complex
construction for avoiding the detection and execution of the inserted code.
Concerning the change of weight discussed in Rule 1.2 of Sect. 3.2, the weight
is not changed by the dummy code insertion. In Fig. 6, the initial weight W is
equal to 1 and the weight W of a tamper-resistant code is also equal to 1 by the
calculation W = W + 0.




Replacement of function. Figure 7 shows the new idea of replacement of
function. A standard function of the C language is expanded into a program and the
name of the function is changed. An example is depicted in Fig. 8. Concerning the
change of weight discussed in Rule 1.2 of Sect. 3.2, the weight is incremented
by one with respect to this conversion algorithm. In Fig. 8 the initial weight W
is 1 and the final weight W is computed by W = W + 1 = 1 + 1 = 2.




Fig. 7. Replacement of function 
Original code (W: weight, initial weight = 1):

    int func_sum(int *x)
    {
        int sum = 0;
        int i = 0;

        /* n is assumed to be declared elsewhere, as in the original figure */
        for (i = 0; i < n; i++)
            sum = sum + *(x + i);
        fprintf(stdout, "%d\n", sum);
        return (sum);
    }

Tamper-resistant code (the replaced call receives weight W + 1 = 2):

    int func_sum(int *x)
    {
        int sum = 0;
        int i = 0;

        for (i = 0; i < n; i++)
            sum = sum + *(x + i);
        STRING(stdout, "%d\n", sum);    /* W + 1 = 2 */
        return (sum);
    }

Expanded function that replaces fprintf:

    #include <stdarg.h>   /* va_list, va_start, va_end */
    #include <stdio.h>    /* FILE, vfprintf */

    int STRING(FILE *stream, const char *format, ...)
    {
        int r;
        va_list args;

        va_start(args, format);
        /* forward the variable arguments to the requested stream */
        r = vfprintf(stream, format, args);
        va_end(args);
        return r;
    }

Fig. 8. Example of replacement



Modification of loop. By the modification of loop, a loop is replaced with 
a new loop which has a different expression but has an equivalent structure in 
terms of number of operations, substitutions, comparisons and so on. We give an 
example of such modification in Fig. 9. There are several rules for modifications 
described in [MTT97] including the modification shown in Fig. 9. The following 
evaluation procedure is applicable to any modification rule. 

Figure 10 shows an example of conversion into a tamper-resistant form.
Concerning the change of weight discussed in Rule 1.2 of Sect. 3.2, the weight is
incremented by the increased nesting depth of loops after the conversion. In
Fig. 10, the tamper-resistant coding is applied to the “for” statement of the initial
code, which is expressed by one loop. After the conversion, the loop is nested three
times. Since “if (i<n){ ... }” is the first and the largest loop of the nest, its
weight W is not changed and W = W + 0 = 1. “for (;;){ ... }” is the
second largest loop of the nest, and W = W + 1 = 2. Likewise, the weight of
“if (!(i<n)) break;” is computed by W = W + 2 = 3.

Another example of conversion is illustrated in Fig. 11. In this example a “goto”
statement appears in the tamper-resistant code and goto → Label contains the lines
appearing from L1 to L2. Therefore, the “goto” statement and Label are regarded as
one kind of nest and its weight is incremented by one as in the “for” statement and
the “if” statement.
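Since Fig. 10 is not reproduced in this text, the following pair of functions is only a reconstruction from the description above: a single “for” loop and an equivalent form nested as if / for(;;) / if-break, with the weights from the text noted in comments. The function names and the summation body are our assumptions.

```c
/* Original form: a single loop. */
int sum_original(const int *x, int n)
{
    int sum = 0;
    int i;
    for (i = 0; i < n; i++)
        sum = sum + x[i];
    return sum;
}

/* Modified form with the same behaviour, nested three levels deep. */
int sum_modified(const int *x, int n)
{
    int sum = 0;
    int i = 0;
    if (i < n) {                /* outermost level: W = 1 + 0 = 1 */
        for (;;) {              /* second level:    W = 1 + 1 = 2 */
            sum = sum + x[i];
            i++;
            if (!(i < n))       /* third level:     W = 1 + 2 = 3 */
                break;
        }
    }
    return sum;
}
```

Both functions perform the same additions and comparisons for n > 0 and both return 0 for an empty array; only the syntactic nesting, and therefore the depth of the parse tree, differs.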
4.2 Experimental Results

Experimental methods. In this section we apply the algorithms explained in
Sect. 4.1 to three short programs: program 1 for calculating a sum, program 2
for FFT and program 3 for primality testing by the sieve of Eratosthenes.
ANALYZER is constructed by combining the lexical analyzer lex and the syntax
analyzer yacc. The conversion algorithms are used either (i) alone or (ii) in
combinations of two.



Results. Table 1, Table 2 and Table 3 show the evaluation results of program 1,
program 2 and program 3, respectively. In these tables D, R and L represent
Dummy code insertion, Replacement of function and modification of Loop,
respectively. The algorithm in the row is applied first and then the algorithm in
the column is applied. In Table 1, the grade of the tamper-resistant code is 30 when
D is applied to the original source code after L. This value is also considered
as the value of algorithm “LD” with respect to the original source code. In Table
1 the points of the original source code are 18, and the grades expressed without
parentheses are derived by subtracting 18 from the points expressed in parentheses.
For instance, the combined algorithm “DD” is evaluated as grade 11 by the
calculation 29-18=11, where 29 is the points of the tamper-resistant code and 18
is the points of the original source code. Each pair of values represents the parameters
(t, s) mentioned in Definition 1 of Sect. 2.



Table 1. Grades of program 1 (Original source code has 18 points)

1st \ 2nd  Alone        D            R            L
D          5(23)        11(29)       11(29)       28(46)
           (1.02,1.07)  (1.02,1.23)  (1.01,2.01)  (1.00,1.22)
R          4(22)        6(24)        6(24)        18(36)
           (1.01,1.67)  (1.01,2.20)  (1.01,2.23)  (1.01,1.82)
L          13(31)       30(48)       18(36)       53(71)
           (1.00,1.15)  (1.00,1.25)  (1.02,1.82)  (1.00,1.30)



Table 2. Grades of program 2 (Original source code has 32 points)

1st \ 2nd  Alone        D            R            L
D          6(38)        7(39)        18(50)       41(73)
           (1.01,1.11)  (1.00,1.20)  (1.00,1.32)  (1.01,1.21)
R          18(50)       18(50)       20(52)       25(57)
           (1.00,1.21)  (1.00,1.24)  (1.00,1.61)  (1.00,1.31)
L          23(55)       26(58)       25(57)       50(82)
           (1.00,1.10)  (1.00,1.11)  (1.00,1.31)  (1.00,1.20)



Table 3. Grades of program 3 (Original source code has 25 points)

1st \ 2nd  Alone        D            R            L
D          3(28)        11(36)       5(30)        13(38)
           (1.00,1.18)  (1.00,1.41)  (1.02,1.73)  (1.00,1.27)
R          2(27)        2(27)        4(29)        14(39)
           (1.01,1.55)  (1.01,1.68)  (1.01,2.07)  (1.01,1.64)
L          12(37)       17(42)       14(39)       38(63)
           (1.00,1.09)  (1.00,1.26)  (1.01,1.64)  (1.00,1.22)



The execution time of each program is assessed by observing user time on a
SUN Ultra10 (UltraSPARC-IIi/333 MHz, OS: Solaris 7). After executing each
program 1000 times, the execution time is computed as the arithmetic mean. The
program size does not count line breaks and spaces. As for the parameters (t, s), the
component-wise maximum of (t, s) in Table 1 is (1.02, 2.23). That means the
results for the 12 tamper-resistant codes in Table 1 can be considered as results
for (1.1, 2.3)-tamper-resistant software. Similarly, the results for the 12
tamper-resistant codes in Table 2 can be regarded as results for (1.1, 1.7)-tamper-resistant
software, and the results for the 12 tamper-resistant codes in Table 3 can
be regarded as results for (1.1, 2.1)-tamper-resistant software.

By following the method in Sect. 3.2, the algorithms for generating
tamper-resistant software are evaluated with respect to the three source codes. Table 4 shows
the grades of the 12 tamper-resistant codes D, DD, DR, DL, R, RD, RR, RL, L,
LD, LR, and LL. We used the arithmetic mean for computing the grades.

Table 4. The grades of tamper-resistant software with respect to program 1,2,3

1st \ 2nd  Alone  D     R     L
D          4.7    9.7   11.3  27.3
R          8.0    8.7   10.0  19.0
L          16.0   24.3  19.0  47.0



4.3 Remarks 

We can observe from Tables 1, 2, 3 and 4 that high grades are recorded when the
algorithm L, which modifies the structure of a loop, is utilized. In particular, when L is used
twice, the highest grade is recorded for every original source code.

Generally speaking, a parse tree has a deeper node when the syntactical 
structure of software becomes complex. The proposed evaluation method uses 
the depth of the parse tree for the measure of difficulty of reading software. In this 
sense, an algorithm converting one expression of software into another expression 
with a deeper parse tree is evaluated as a difficult conversion algorithm. As 
observed above, an example of such algorithm is the algorithm to modify the 
structure of loop. 

5 Conclusions 

In this paper, we have proposed an objective and quantitative evaluation method 
of tamper-resistant software by paying our attention to syntax analysis of com- 
pilers. We have developed software for the proposed evaluation and conducted 
experiments. The experimental results actually show that an algorithm conver- 
ting one expression of software into another expression with a deeper parse tree 
is evaluated as a difficult conversion algorithm. 

Throughout the evaluation we have only evaluated the complexity originated 
from a single parse tree even in software represented by multiple parse trees. This 
is because we have concentrated on the most basic procedure of an objective 
and quantitative evaluation method. In the future work we will examine a new 
evaluation method for software which is parsed into multiple trees. 

We should also extend our experiments in order to better understand the
relationships between the proposed evaluation method and elementary algorithms
for generating tamper-resistant software. Furthermore, it is important to clarify
the relationship between the proposed evaluation method and evaluation
involving human beings. With clear knowledge of it, we can estimate a desirable
grade from which we can judge software to be tamper-resistant.

Abstract. A digital fingerprint is a unique pattern embedded in a digital
document so that a specific copy can be identified when it is used illegally.
We have looked at two specific code structures for fingerprinting purposes:
binary linear codes, often used as error correcting codes, and what we call
a binary sorted code.

Keywords: fingerprinting, copyright protection, watermarking 



1 Introduction 

Every year the legal distributors of computer programs, music CDs and movies 
lose billions of dollars because of illegal copies. They have tried to prevent this by 
restrictions on the information carrier, for example the floppy disc, the CD, the 
video tape or the ordinary book. This has been more or less successful. Today 
however most of the information is digital and very easy to copy. With the in- 
creasing use of the Internet more and more information is spread in digital form. 
Maybe you need a special environment to run your program, or the information 
is encrypted so you need a special key to read it. But in the end you have got 
to be able to use the information, otherwise the product is of no use to the legal 
buyers. And when you have the information it is very difficult to prevent copying 
and illegal distribution. 

Our belief is that preventing illegal copying is very hard, so our solution is
to mark each copy uniquely, so that if it is spread illegally the original owner
of that copy can be found. The principle is the same as a serial number on a
product, but a serial number is often easy to find and remove. The serial number,
or what we call the “fingerprint”, has to be embedded in the information itself,
without changing the product.

We will assume that the embedding can be done and concentrate on the 
structure of the “fingerprint”. In this paper we look at two specific structures 
on codes. We analyse these code structures when used as fingerprinting codes, 
trying to find characteristics that are good for a fingerprinting scheme. 

1.1 Related Work 

The earliest paper on fingerprinting that we have found is [11], which contains a
general description of the idea. The first paper on fingerprinting in the presence
of collusions is [2], which presents a fingerprinting scheme that specifies the
number of copies an opponent must obtain in order to erase the fingerprints.
An important paper in the area of collusion secure fingerprinting is [3].

There is a lot of research in areas connected to fingerprinting, like
watermarking and steganography. By watermarking we mean the act of marking
a digital object with the same watermark in each copy, for example to prove
ownership. Steganography is the act of hiding information in other information;
here the distributor does not want others to know there is anything hidden in
the innocent object.

Most of the research we have found is on the embedding of the information,
where the information may be the fingerprint or the watermark. Many papers
have been written on how to embed information in images, for example [5] and [6].

Asymmetric fingerprinting is where the distributor and the user work together
to create the fingerprinted copy. As a result, a dishonest distributor can not
recreate the fingerprinted copy and accuse an innocent user. For more about
this see [7]. This can be done if the underlying fingerprinting scheme is
secure.

Anonymous fingerprinting goes even further; here the distributor does not
know the user's identity. The user's identity can only be uncovered when his
copy is used illegally and found. This can also be done when a secure
fingerprinting scheme exists, see [8].

Another area connected to fingerprinting is traitor tracing, first introduced 
in [4]. A paper comparing fingerprinting and traitor tracing is [10] and a traitor 
tracing system can be found in [1]. 

2 Notation 

In the document we have a number of positions where alterations can be made
without changing the content of the document. Each of these positions has, in
our binary case, two possible bits. We denote these bits {0,1}. A vector of n
bits is a fingerprint or codeword.

A copy of the document is marked with a unique codeword and given to a
user. We have M users and therefore also M codewords, i.e. every user has a
codeword different from all the others. These M codewords are called a
fingerprinting code. The code can be represented as a code matrix, T, with the
codewords as rows.

In every position a bit is represented with a specific symbol in the document;
this is called a mark. Since we have a binary code there will be two marks in
every position, i.e. one for each bit. The marks are independent of each other.

A user who is trying to use his document illegally is called a pirate. There can 
be more than one pirate working together to create an illegal copy. We denote 
the number of pirates c. The pirate, or pirates, uses a strategy to create an illegal 
copy. The fingerprint extracted from this illegal copy by a tracer is called a false 
word. We denote the false word z. 

3 Binary Linear Codes

The fingerprinting problem has many similarities to telecommunication, where
error correcting codes are used. If we use some of the error correcting codes
considered in coding theory we can apply their structure and theory when
evaluating the codes for the purpose of fingerprinting. The most important
and well-studied class of codes is linear codes. In this section we try binary
linear codes for fingerprinting purposes.

3.1 The Problem with Binary Linear Codes 

A linear code is homogeneous and additive. This means, among other things, that 
if you add two or more codewords in every position you get a new codeword. So 
if the pirates can add modulo 2 they can create a codeword. 

We will define a strategy for the pirates where they choose the bit which 
appears an odd number of times. This can only be done by a pirate group with 
an odd number of pirates. 

Definition 1. The Modulo Two strategy in a binary code is when an odd num- 
ber of pirates in every position choose the bit which appears an odd number of 
times. 



Example 1. If three pirates have the embedded codewords

afrt
afry
dfjy

they will get the false word {dfjt}, if they use the Modulo Two strategy.

As can be seen in the example the undetectable positions will automatically 
be set to the bit which appears an odd number of times since the pirate collusion 
has an odd number of pirates. 

In the lemma below we show that the Modulo Two strategy corresponds to 
modulo 2 calculation of binary codewords. 

Lemma 1. An odd number of pirates using the Modulo Two strategy gives the 
same false word as if the same pirate codewords, in every position, are added 
modulo 2. 

Proof. Regard a particular position. Either there is an odd number of 0s and an
even number of 1s, yielding the modulo 2 sum 0, or there is an even number of
0s and an odd number of 1s, yielding the modulo 2 sum 1. □
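
Lemma 1 can be checked on Example 1 with a minimal sketch. The function names
are hypothetical, and the mapping of marks to bits (a, f, r, t to 0 and d, j, y
to 1) is an assumption made only for this illustration.

```python
from collections import Counter
from functools import reduce

def modulo_two_strategy(words):
    """In every position pick the symbol that occurs an odd number of times.

    Assumes an odd number of words and at most two different symbols per
    position, so exactly one symbol has an odd count in each position.
    """
    if len(words) % 2 == 0:
        raise ValueError("the strategy needs an odd number of pirates")
    false_word = []
    for column in zip(*words):
        counts = Counter(column)
        odd = [symbol for symbol, count in counts.items() if count % 2 == 1]
        false_word.append(odd[0])
    return "".join(false_word)

def xor_words(words):
    """Position-wise modulo 2 sum of binary words given as strings of 0/1."""
    return "".join(
        str(reduce(lambda x, y: x ^ y, (int(b) for b in column)))
        for column in zip(*words)
    )

# Example 1 from the text, working directly on the marks.
marks = ["afrt", "afry", "dfjy"]
print(modulo_two_strategy(marks))                       # dfjt

# The same three codewords written as bits (assumed mark-to-bit mapping).
bits = ["0000", "0001", "1011"]
print(modulo_two_strategy(bits), xor_words(bits))       # 1010 1010
```

The pirates never need to know which mark stands for which bit: counting marks
per position gives the same word as adding the underlying bits modulo 2.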

Remember that every pirate has a unique codeword, i.e. two pirates can not 
have the same codeword. With Lemma 1 and the linearity of the codes we can 
state the following. 

Theorem 1. An odd number, greater than one, of pirates in a fingerprinting
system with a binary linear code can create a codeword that is not among the
codewords in the pirate group.

Proof. According to Lemma 1, an odd number of pirates can add every position
in their codewords modulo 2, and in a binary linear code, if you add codewords
modulo 2 you get a new codeword.

There are $c$ pirates, $c$ is odd, and we denote the set of their codewords
$P = \{p_1, p_2, \ldots, p_c\} \subseteq C$, where $C$ is the code.

Take two of the codewords, $A_1 = \{a_1, a_2\} \subseteq P$, $a_1 \neq a_2$.

First choose $a_3 \in P$, $a_3 \notin A_1$. Add these three codewords,
$a_1 + a_2 + a_3$; this gives a codeword $a_4 \in C$. If $a_4 \notin P$ we have
a codeword in the code but not in the pirate group that $P$ can create and the
theorem is true. So we assume that $a_4 \in P$. If $a_4 \in A_1 \cup \{a_3\}$,
two of the codewords in $A_1 \cup \{a_3\}$ must be equal, so
$a_4 \notin A_1 \cup \{a_3\}$. Create $A_3 = A_1 \cup \{a_3, a_4\}$.

Assume that it is true for $a_i \in P$, $a_i \notin A_{i-2}$ that
$a_1 + a_2 + a_i$ gives a codeword $a_{i+1} \in C$ that is either not in the
pirate group or $a_{i+1} \in P$ but $a_{i+1} \notin A_{i-2} \cup \{a_i\}$.

Choose $a_{i+2} \in P$, $a_{i+2} \notin A_i$. Add the three codewords,
$a_1 + a_2 + a_{i+2}$; this gives a codeword $a_{i+3} \in C$. If
$a_{i+3} \notin P$ we have a codeword in the code but not in the pirate group
that $P$ can create.

So we assume that $a_{i+3} \in P$. To see if $a_{i+3} \in A_i \cup \{a_{i+2}\}$
we must check two cases, $a_{i+3} \in A_i$ or $a_{i+3} = a_{i+2}$.

If $a_{i+3} = a_{i+2}$ we have that $a_1 = a_2$, which can not be true.

If $a_{i+3} \in A_i$ we assume that $a_{i+3} = x \in A_i$, which gives the
equation $a_1 + a_2 + x = a_{i+3}$, and since we have that
$a_1 + a_2 + a_{i+2} = a_{i+3}$, $a_{i+2} = x$, which can not be true.

We have now proven that $a_1 + a_2 + a_i$, with $a_1, a_2, a_i \in P$, gives a
codeword $a_{i+1} \in C$ that is either not in the pirate group,
$a_{i+1} \notin P$, or in the pirate group, $a_{i+1} \in P$, and
$a_{i+1} \notin A_{i-2} \cup \{a_i\}$.

As long as $k < c$ there can exist an $a_k$ so that
$a_k \notin A_{k-3} \cup \{a_{k-1}\}$, $a_k \in P$. For a pirate group of odd
number the last equation is $a_1 + a_2 + a_c = a_k$, and no such $a_k$ exists
in $P$, so $a_k$ must be a codeword not in $P$. □



Cosets of Binary Linear Codes. To construct a coset to a binary linear code 
you add a binary vector to each codeword. This is not a linear code, as can most 
easily be seen by the fact that the zero word may not be included in the new 
code, so perhaps the Modulo Two strategy does not work on these codes. 

We look at three codewords in a coset of a binary code in an example. 

Example 2. Take a binary vector $a$ and some binary codewords in a linear code,
$c_1$, $c_2$ and $c_3$. Let $v_1 = a + c_1$, $v_2 = a + c_2$ and
$v_3 = a + c_3$. Add $v_1$ and $v_2$ modulo 2:
$(a + c_1) + (a + c_2) = (c_1 + c_2)$, which is a word in the linear code, but
not in the coset code. Now add $v_1$, $v_2$ and $v_3$:
$(a + c_1) + (a + c_2) + (a + c_3) = ((c_1 + c_2 + c_3) + a)$, which is a
codeword in the new coset code.

Since the Modulo Two strategy in the fingerprinting scheme always uses an
odd number of codewords the pirates will always end up with a codeword in the
new coset code.

This means that the coset of a linear code is not an option for a fingerprinting 
code, using the same argument as for the linear code. 
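
Both observations can be verified exhaustively on a toy code. The sketch below
is only an illustration: the generator matrix `G` and the vector `a` are
hypothetical choices, and the helper names are not from the paper.

```python
from itertools import combinations, product

def xor(*words):
    """Position-wise modulo 2 sum of equal-length bit tuples."""
    return tuple(sum(bits) % 2 for bits in zip(*words))

def span(generators):
    """All codewords of the binary linear code spanned by the generator rows."""
    zero = (0,) * len(generators[0])
    code = set()
    for coeffs in product((0, 1), repeat=len(generators)):
        chosen = [g for g, c in zip(generators, coeffs) if c]
        code.add(xor(*chosen) if chosen else zero)
    return code

# Hypothetical toy generator matrix, chosen only for illustration.
G = [(1, 0, 0, 1, 1, 0),
     (0, 1, 0, 1, 0, 1),
     (0, 0, 1, 0, 1, 1)]
code = span(G)

a = (0, 0, 0, 0, 0, 1)                     # not a codeword of the linear code
coset = {xor(a, c) for c in code}

def three_pirates_escape(words):
    """True if every 3 distinct words sum to a word of the set that is
    none of the three words themselves."""
    return all(
        xor(p, q, r) in words and xor(p, q, r) not in (p, q, r)
        for p, q, r in combinations(sorted(words), 3)
    )

# Sums of two coset words fall back into the linear code, never into the coset.
pair_sums_in_coset = any(
    xor(u, v) in coset for u, v in combinations(sorted(coset), 2)
)

print(three_pirates_escape(code), three_pirates_escape(coset), pair_sums_in_coset)
# True True False
```

For this toy code every group of three pirates can produce a codeword that
belongs to an innocent user, both in the linear code and in its coset.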

3.2 Conclusions for the Binary Linear Codes 

To use a binary linear code as fingerprinting code will not work if three pirates 
collude. If more than three pirates collude, their best strategy is to combine three 
of the pirate fingerprints and see to it that the ones left unused do not have the 
codeword made by the others. In a pirate group of odd number this can always 
be done. 

The cosets of binary linear codes have the same weakness. 

This does not mean that the structure of linear codes can not be used. For 
example what happens if we use a subset of the codewords in a linear code? 
In this section only binary codes were considered. We might discover that non- 
binary codes can be used. 

4 Sorted Codes 

The purpose of this section is to examine what we call a binary sorted code in 
order to see what pros and cons there are for using this structured code, when 
the number of pirates is n—1. This code, especially when the number of pirates 
is all but one user, is most suited as an inner code of a concatenated code system 
for fingerprinting. 

For this section we will make an assumption on the pirates' behaviour from [3].

(The Marking Assumption). 

Colluding pirates can detect a position if, and only if, the bits differ between 
their copies. They can not change an undetectable bit without destroying the 
fingerprinted object. 

This means that the pirates will only be able to change the detectable
positions and will have to leave the rest of the document intact. This is a
reasonable assumption since the pirates have no knowledge of the locations of
the undetectable positions. Therefore, to be able to change some of the
undetectable positions they have to change many locations not in the
fingerprint, probably degrading the document until it is useless.

With this assumption the case with a single pirate is trivial since no changes 
will be made in the illegal copy and the tracing algorithm will have no problem 
finding the guilty user. The problem is a collusion of two or more pirates. They 
will always have some detectable positions to work with. 
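
The Marking Assumption can be made concrete with a short sketch. It assumes
copies are given as equal-length strings of marks and that pirates never
produce unreadable marks; the function names are hypothetical.

```python
def detectable_positions(copies):
    """Indices where the colluders' copies do not all carry the same mark."""
    return [i for i, column in enumerate(zip(*copies)) if len(set(column)) > 1]

def is_feasible(false_word, copies):
    """True if the false word keeps every undetectable position unchanged and
    uses only marks the colluders have seen in the detectable positions."""
    return all(z in column for z, column in zip(false_word, zip(*copies)))

single = ["0011"]
pair = ["0011", "0101"]

print(detectable_positions(single))     # []
print(detectable_positions(pair))       # [1, 2]
print(is_feasible("0001", pair))        # True
print(is_feasible("1001", pair))        # False
```

A single pirate sees no detectable position and can only redistribute his own
codeword, while two pirates already have positions 1 and 2 to work with.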

We will define a sorted code as below. 

Definition 2. A binary code matrix, $\Gamma$, is sorted if all columns have
the 1s in the top rows and the 0s below.

Without loss of generality we can always order the columns in increasing
weight and we will do so in this section.

We will call the unsorted code matrix $\Gamma_{unsort}$ and the sorted code
matrix $\Gamma_{sort}$.

Below we make two definitions on the columns of the code matrix.

Definition 3. $\gamma_j$ is the set of columns with Hamming weight $j$.

Definition 4. A sub-column is a column in a code matrix without one of the
rows.

The pirate group will see sub-columns of the code. 
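
Definitions 2 to 4 translate directly into code. The sketch below assumes a
code matrix given as a list of rows of 0/1 integers; `sorted_code` is a
hypothetical helper that builds a sorted matrix with `d` columns of each weight
from 1 to n-1.

```python
def column_weights(matrix):
    """Hamming weight of every column of a 0/1 matrix given as a list of rows."""
    return [sum(column) for column in zip(*matrix)]

def is_sorted(matrix):
    """Definition 2: in every column the 1s sit in the top rows, the 0s below."""
    return all(
        list(column) == sorted(column, reverse=True) for column in zip(*matrix)
    )

def gamma_sets(matrix):
    """Definition 3: map each weight j to the column indices of weight j."""
    sets = {}
    for index, weight in enumerate(column_weights(matrix)):
        sets.setdefault(weight, []).append(index)
    return sets

def sub_columns(matrix, missing_row):
    """Definition 4: the matrix without one row (row index is 0-based)."""
    return [row for i, row in enumerate(matrix) if i != missing_row]

def sorted_code(n, d):
    """Hypothetical helper: n rows, d columns of each weight 1..n-1,
    columns ordered by increasing weight."""
    return [[1 if row < weight else 0 for weight in range(1, n) for _ in range(d)]
            for row in range(n)]

matrix = sorted_code(4, 1)
print(is_sorted(matrix), gamma_sets(matrix))        # True {1: [0], 2: [1], 3: [2]}
print(column_weights(sub_columns(matrix, 1)))       # [1, 1, 2]
```

In the last line the group lacking the second codeword sees two sub-columns of
weight 1, so the columns of weight 1 and weight 2 look alike to that group.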

Lemma 2. In a sorted code a group of $n-1$ users will be able to group the
columns with the same weight in their sub-columns together.

Proof. The pirates can only see the marks, i.e. the embedded bits, and will not
know which mark is for the 0s and which is for the 1s. But in a column, the
marks for the 0s are the same and the marks for the 1s are the same.

Choose a column with one mark of a specific sort and $n-2$ marks of the other
sort. The row with the single mark will be all 0s or all 1s due to the structure
of the sorted code. From this row, the pirates can tell which marks in the code
have been embedded from the same bit, since in each column they will know one
specific mark representing the same bit. From this, the columns can be grouped
together even though they do not know which mark is which bit. □

The $\gamma_j$s are ordered from the lowest weight to the highest. Since the
pirates can not tell which mark is 0 and which is 1 they may order the sets
$\gamma_j$ from the highest weight to the lowest instead, without knowing. We
will say that they have ordered the sets in reverse.

Theorem 2 shows that in every code matrix, not only sorted code matrices, 
when c = n — 1 and all sub-columns of the same weight can be grouped and 
ordered, there has to be at least one column of every weight. If not, there exists 
a specific word that every group can create with probability one. When this 
word is found the tracer gets no information about the pirates and can not 
accuse anyone. 

Theorem 2. A code where the pirates can group the $\gamma_j$ together and order
them must have at least one column with every Hamming weight
$j \in \{1, 2, \ldots, n-1\}$, otherwise there exists a false word that every
group with $n-1$ users can create with probability one.

Proof. Assume that there are no columns with weight $j^*$. Let $j'$ be the
weight of a sub-column, where a sub-column is a column in the part of $\Gamma$
seen by a group of $n-1$ users, and let $j$ be the weight of the corresponding
column in $\Gamma$.

The same false word, $z = \{z_1, z_2, \ldots, z_{l-1}, z_l\}$, with $z_i = 0$
when $j < j^*$ in column $i$ in $\Gamma$ and $z_i = 1$ when $j > j^*$ in column
$i$, can be created by every group of $n-1$ users.

The pirates choose one sort of the marks to be 0 and the other to be 1. In
column $i$ in $\Gamma$:

— If $j \in \{1, \ldots, j^*-1\}$ then $z_i = 0$
— If $j \in \{j^*+1, \ldots, n-1\}$ then $z_i = 1$

If the ordering of the sets $\gamma_j$ is correct, i.e. the marks are chosen
correctly, the pirates will correctly think that $\Gamma$ has no columns of
weight $j^*$. This means that no sub-columns with weight $j^*-1$ will have
$j^*$ ones in the corresponding column in $\Gamma$. Hence all groups of $n-1$
users can create the wanted $z$.

If the ordering of the sets $\gamma_j$ is the reverse, the pirates will
incorrectly think that $\Gamma$ has no columns of weight $n-j^*$. This will
still create the same false word, since the choice is on the mark, not on the
bit. □

For a better understanding of the proof see the example below. 

Example 3. The pirates find sub-columns of the code below and, by using the
strategy in the proof above, create the word below the matrix, regardless of the
ordering. Assuming that a = 1 and b = 0, with $j^* = 2$, gives the same false
word as assuming that a = 0 and b = 1, with $j^* = 4$.

aaaa aaaa aaaa aaaa 
bbbb aaaa aaaa aaaa 
bbbb aaaa aaaa aaaa 
bbbb bbbb aaaa aaaa 
bbbb bbbb bbbb aaaa 
bbbb bbbb bbbb bbbb 

bbbb aaaa aaaa aaaa 
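
Example 3 can be replayed for every pirate group. The sketch below is an
illustration under stated assumptions: the matrix is rebuilt with a = 1 and
b = 0, the pirates' decision is taken on the weight of the sub-column they
see, and the function names are hypothetical.

```python
def example_matrix():
    """Sorted matrix of Example 3 with a = 1, b = 0: n = 6 rows and four
    columns of each weight 1, 3, 4, 5 (weight 2 is missing)."""
    weights = [w for w in (1, 3, 4, 5) for _ in range(4)]
    return [[1 if row < w else 0 for w in weights] for row in range(6)]

def false_word(sub_matrix, j_star):
    """Strategy from the proof: 0 where the sub-column weight is below j*, else 1."""
    return [0 if sum(column) < j_star else 1 for column in zip(*sub_matrix)]

matrix = example_matrix()
n = len(matrix)
words = set()
for missing in range(n):
    seen = [row for i, row in enumerate(matrix) if i != missing]
    # Correct guess of the embedding: weight j* = 2 is missing.
    z = false_word(seen, 2)
    # Reversed guess: all bits look complemented and weight n - 2 = 4 seems missing.
    flipped = [[1 - bit for bit in row] for row in seen]
    z_reversed = [1 - bit for bit in false_word(flipped, n - 2)]
    assert z == z_reversed
    words.add("".join("a" if bit else "b" for bit in z))

print(words)    # {'bbbbaaaaaaaaaaaa'}
```

All six groups of five users arrive at the same word, with either guess of
the embedding, so the tracer learns nothing about which user is missing.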

Theorem 2 is true for the sorted code according to Lemma 2, so now we know
that there are going to be at least $n-1$ columns in $\Gamma_{sort}$. This will
not be enough. Suppose there is only one column with weight $j^*-1$ and one
with weight $j^*$. If there are two sub-columns with weight $j^*-1$ the pirates
only have to guess which is the one with weight $j^*$ in the whole code and use
the strategy in the proof above. The only time the pirates have to guess is when
the missing word has a 1 in the column with weight $j^*$ and a 0 in the column
of weight $j^*-1$. In this case they will succeed in creating the wanted word
with probability 0.5. In all other cases the pirates will create the wanted
word with probability one.

4.1 Properties for Groups in the Sorted Code 

We denote the pirate group missing the $i$th word with $S_i$.

$S_1$ and $S_{n-1}$ are special. This is because these pirate groups will not
see one of the sets $\gamma_j$, as the Marking Assumption gives that some of the
positions will be






T. Lindkvist 



undetectable. This also means that they will not have two sets $\gamma_j$ that
look the same.

The remaining pirate groups, $S_2, \ldots, S_{n-2}$, will all be in the same
situation. They will see all sets $\gamma_j$, but the group $S_i$ will not be
able to tell the columns in $\gamma_{i-1}$ and the columns in $\gamma_i$ apart.
The group $S_i$ will be able to order the $\gamma_j$, except for $\gamma_{i-1}$
and $\gamma_i$, but the order may be reversed, due to the unknown embedding.

$S_2$ will see the codewords below, permuted and embedded of course. The sets
$\gamma_j$ can be grouped except for $\gamma_1$ and $\gamma_2$, which can not be
told apart.

$S_2$ will not be able to tell if it is codeword 2 or $n-2$ that is missing.
They may see 0 as 1; if this is the case all $\gamma_j$ will be seen as
$\gamma_{n-j}$.

1111 1111 1111 1111 1111 1111
0000 0000 1111 1111 1111 1111
0000 0000 0000 1111 1111 1111
0000 0000 0000 0000 1111 1111
0000 0000 0000 0000 0000 1111
0000 0000 0000 0000 0000 0000

4.2 The Strategy with a Sorted Code 

From Theorem 2 we know that every set $\gamma_j$ exists, otherwise there exists
a strategy that leaves the tracer with no better algorithm than to randomly
choose one of the users as a pirate.

In $\gamma_1$ the group of pirates has to make a majority choice, otherwise the
tracer knows that codeword number one is in the group. A majority choice is when
the pirates choose the bit most represented in a position. According to the
Marking Assumption the pirates can only do things in the detectable positions,
and if codeword one is missing the positions in $\gamma_1$ would be undetectable
and therefore 0.

The same is true for $\gamma_{n-1}$, only there the choice has to be a 1.

Lemma 3. In positions where there is one mark of one sort and the rest are of 
the other mark the pirate group has to make a majority choice, if all columns of 
weight 1 and n—1 are not undetectable. If not, the pirate with the single bit will 
reveal himself. 

Proof. According to Theorem 2 the code has columns where all but one of the
marks are the same bit. The Marking Assumption says that the pirates can
only make changes in the detectable positions. If the pirate group in one of
these columns chooses the single bit, they will show the tracer that this
position is detectable, since if it were undetectable a majority choice would
be made automatically according to the Marking Assumption. This position can
only be detectable if the pirate with the single bit is in the pirate group, and
this pirate can be accused. □




Characteristics of Some Binary Codes for Fingerprinting



To be able to know if all columns of weight one and n — 1 are undetectable, 
the pirates simply have to count the detectable positions and see if the number 
they get plus the number of columns of weight one and n — 1 add up to the 
length of the codewords. 

From this we know that a false word will contain only 0s in $\gamma_1$ and only
1s in $\gamma_{n-1}$. This is true for all codes, not only the sorted ones.

For the rest of the $\gamma_j$s the pirates can choose to put as many 1s as they
like in a specific $\gamma_j$, except that the $\gamma_j$s may be in reverse
order and that they can not distinguish $\gamma_{i-1}$ and $\gamma_i$. This
means that the pirates can almost distribute the number of 1s as they like in
$\gamma_2$ to $\gamma_{n-2}$.

A general strategy, that can always be used by a pirate group of $n-1$ users:

1. Count the number of detectable positions.

a) If it is not the same as the code length, codeword 1 or $n-1$ is missing.

b) If it is the same as the code length, $\gamma_j$ and $\gamma_{j+1}$ can not
be told apart. This tells the pirates that codeword number $j+1$ or $n-(j+1)$
is missing.

2. Make a majority choice where there is a single mark of one sort in the
observed sub-columns.

3. Choose the numbers of 0s and 1s in the $\gamma_j$s not chosen with a majority
choice. If there is a set with both the columns from $\gamma_j$ and
$\gamma_{j+1}$ these must be considered together.

In the third step, the distribution may depend on the pirates knowing the
right order of the $\gamma_j$s; if so, the pirates must guess the embedding,
with probability of success 0.5.
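
The three steps can be sketched as follows. This is an illustration only: the
copies are given as strings of marks, the `choose` hook standing for the free
choice in step 3 is hypothetical, and the sketch assumes a group of at least
three pirates so that a single mark is always a strict minority.

```python
from collections import Counter

def general_strategy(copies, choose=lambda column: column[0]):
    """Return the false word and whether step 1a applies.

    `choose` decides the mark in positions left open by step 3; by default
    it simply copies the first pirate's mark.
    """
    columns = list(zip(*copies))
    detectable = [i for i, col in enumerate(columns) if len(set(col)) > 1]
    # Step 1: compare the number of detectable positions with the code length.
    step_1a = len(detectable) != len(columns)

    false_word = []
    for col in columns:
        counts = Counter(col)
        if len(counts) == 1:
            false_word.append(col[0])                       # undetectable position
        elif min(counts.values()) == 1:
            false_word.append(counts.most_common(1)[0][0])  # step 2: majority choice
        else:
            false_word.append(choose(col))                  # step 3: free choice
    return "".join(false_word), step_1a

# Sorted code with n = 5 and one column of each weight 1..4.
# Group missing the third codeword (0011):
print(general_strategy(["1111", "0111", "0001", "0000"]))   # ('0111', False)
# Group missing the first codeword (1111):
print(general_strategy(["0111", "0011", "0001", "0000"]))   # ('0011', True)
```

In the first call the two middle columns cannot be told apart by the group and
are handled together by the same `choose` rule, as step 3 requires.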

4.3 Permutation 

When a code is sorted the pirates will not be able to figure out the permutation 
among the columns of the same weight. In an unsorted code pirates may be able 
to match some of their columns to the columns of the code and learn things 
about the permutation, since the columns of the same weight can be told apart. 

The unsorted code will hide the $\gamma_j$, since the missing codeword will not
have the same bit in all columns with the same weight.



4.4 The Sorted Code in a Concatenated Code 

The sorted code can be used as an inner code in a concatenated code. The whole 
code will be permuted before embedding. This means that the columns of one 
inner codeword will be mixed with the columns from all other inner codewords. 

It can be hard for the pirates to carry out the general strategy in the last 
section on the sorted code if it is an inner code in a concatenated code. The 
columns with the same distribution will look the same for all inner codewords. 
It will not be possible to tell which columns belong to which inner codeword, 
thus in such a code the strategy can only be the same on all inner codewords. 

4.5 Conclusions for the Sorted Code

The sorted code makes it easier to draw conclusions, at least when $c = n-1$,
i.e. the number of pirates is all but one. This is because the different pirate
groups are in the same situation, except for $S_1$ and $S_n$.

The distributor knows exactly what the pirates have to work with and can
adjust the tracing algorithm for that. One example is the inner code in [3] where
the security, among other things, is built on the fact that the pirates in $S_i$
can not distinguish $\gamma_{i-1}$ and $\gamma_i$.

In an unsorted code the situation for the pirates is different depending on
which codeword is missing and therefore the situation is harder to analyse. On
the other hand the pirates will not be able to find the $\gamma_j$ for a
specific column, since columns with the same weight in $\Gamma$ can have both
0s and 1s in the missing codeword.

5 Discussion 

This paper looks at two specific code structures, trying to find out which
properties are good and which are bad in a fingerprinting code.

The area of fingerprinting seems to be closely related to error correcting codes 
so it is natural to seek there for a good code structure. Binary, linear codes have 
properties that are good for error correcting. For fingerprinting they can not be 
used as they are as can be seen in Theorem 1. 

The best code we have found, in [3], has the sorted code as inner code.
Analysing this code will give us knowledge about the characteristics of a good
code. The security of a sorted code depends much on the fact that the
permutation of the code hides the $\gamma_j$s. In [3] every inner codeword has
a high security; as said in Sect. 4.4, this is probably too strong a demand,
since the $\gamma_j$s will mix between the inner codewords.

Abstract. Several electronic auction protocols have been proposed
[2,3,4,5,6,8,11,12]. An electronic auction protocol should satisfy the
following seven properties: (a) Fairness of bidders; (b) Security of bids;
(c) Anonymity; (d) Validity of winning bids; (e) Non-repudiation;
(f) Robustness; and (g) Efficient bidding points. As for anonymity, previous
protocols assume that some entities, such as a dealer or plural centers, are
trusted. In this paper, anonymity is realized without a trusted center, while
keeping both computational and round complexity low. Furthermore, we
represent a bid efficiently by using binary trees: for $2^k$ bidding points, the
size of the representation of bids is just $k$. Previous works investigating
a sealed-bid auction aim at “efficiency” but not the “entertainment” seen in
an English auction [2,4,5,6,11,12]. We introduce a new idea of entertainment
to the opening phase by decreasing winner candidates little by little. Our
protocol has the following three main features in addition to the above
seven properties: perfect anonymity (a single non-trusted center), efficient
bidding points and entertainment.

Keywords: anonymity, sealed-bid auction, bidding points, entertainment,
one-way function



1 Introduction 

An auction is a price-decision system based on a market principle rather than
on a fixed price. An auction price reflects a market price more clearly than a
fixed price since it is decided by bidders. There are many different types of
auction. An English auction is the most familiar type. In an English auction,
bidders offer higher prices for the goods one by one, and finally the bidder who
offers the highest price gets the goods. Each bidder participates in the
price-decision process and enjoys it. So an English auction has a feature of
entertainment as well as being a price-decision system. A sealed-bid auction is
another type, in which each bidder secretly submits a bid to a center only once.
Therefore a sealed-bid auction decides the price more efficiently than an
English auction. However, the bidders cannot enjoy the price-decision process,
so a sealed-bid auction does not have a feature of entertainment. In real (i.e.
non-electronic) auctions, both types are held and desired. On the other hand,
many electronic auction protocols realize a sealed-bid auction [2,4,5,6,11,12].
We note that all electronic auctions aim at efficiency but not at a feature of
entertainment.

There are mainly three entities in an auction: a center (C), a vendor (V) and
a bidder (B). These basic components are also used in an electronic auction.
Each role is as follows:

— Center (C): This includes an auctioneer. A center sponsors several auctions. 

— Vendor(V): Vendor wants to sell her/his goods and is registered to a center. 

— Bidder(B): Bidder wants to buy goods and is registered to a center.

V only requests an auction to C and communicates with neither C nor B while an 
auction is held. An auction process is conducted between C and B. The following 
are seven properties that are required in an electronic auction protocol: 

(a) Fairness of bidders: all bidders can view a proper polling on the Internet.

(b) Security of bids: nobody can forge (falsify) or tap a bid.

(c) Anonymity: nobody knows the correspondence of a bidder to a bid even
after the opening phase. Note that, in an electronic auction, this does not mean
the secrecy of losing bids. Anonymity that does not reveal which bidder,
except for a winner, has placed which bid can be realized even if some losing
bids are revealed.

(d) Validity of winning bids: a protocol can prove that a winning bid is the
highest or the lowest value of all bids.

(e) Non-repudiation: a winner cannot deny that he/she submitted the winning
bid after the bid is opened.

(f) Robustness: even if a bidder sends an invalid bid, the auction process is
unaffected.

(g) Efficient bidding points: if the bidding points are set up discretely, many
bidding points are desirable.

In addition to the above seven properties, a sealed-bid auction requires the
following property.

(h) Secrecy of losing bids: a protocol keeps losing bids secret.

The secrecy of losing bids is not required in an English auction, since
all losing bids are revealed. Therefore the necessity of secrecy of losing bids
depends on which kind of electronic auction is targeted. As we will describe
below, we aim at a sealed-bid auction with a feature of an English auction. So
our protocol reveals only part of the distribution of bids but does not reveal
losing bids directly.

Various works on electronic auctions have been proposed [2,3,4,5,6,8,11,12].
In [8], the timing at which each bidder sends a bid is considered for a
real-time electronic auction on the Internet. A sealed-bid auction protocol is
investigated in [2,4,5,6,11,12] and a second-price auction protocol is
discussed in [3]. A second-price auction is a kind of sealed-bid auction: the
bidder who offers the highest price gets the goods at the second-highest price.
For anonymity, a bid [2,3,4] or the opening function [11] is distributed among
centers by using the secret sharing technique [13]. With this technique,
however, the correspondence of a bidder to a bid can be leaked by a dealer [2]
or by a collusion of centers forming a quorum [3,4,11]. On the other hand, the
scheme of [6], which does not use the secret sharing technique, cannot provide
anonymity against the center. To sum up, the previous protocols assume that
some entities, such as a dealer or centers, are trusted. Usually plural centers
require more communication cost [2,3,4] or more computation [11].
On the other hand, [5,12] realize anonymity without a trusted center; however,
both the computational and the round complexity for bidders are rather high in
the opening phase. [5] uses not a public key cryptosystem but a one-way hash
function, by introducing the technique of “PayWord” [10], which greatly
decreases the computational complexity. Even with such a technique, high
round complexity for bidders is required for anonymity without a trusted center.
In this paper, anonymity on the correspondence of a bidder to a bid is realized
without a trusted center, while keeping both computational and round complexity
low. In a sense our protocol realizes “perfect anonymity”, and also realizes a
“non-trusted center”.

The bidding points are set up discretely in advance in order to realize
anonymity [3,4,5,11,12]. Therefore the more bidding points are set up, the
lower the probability of a tie. In [4], the size of the representation of bids
directly depends on the number of bidding points: for $k$ bidding points, the
size of the representation of bids is just $k$. Therefore the more bidding
points are set up, the more communication is required in the bidding phase.
Although the bidding points are expressed rather efficiently by a logarithmic
expression in [3], neither protocol [3,4] can handle tie bids or invalid bids
well: they cannot specify the winners or how many winners there are if the same
winning bids or invalid bids are submitted. On the other hand, in [11] a bid is
expressed efficiently as an encryption of a known message, which does not
depend on the number of bidding points. Therefore it improves the
representation of bids. However, it costs much computation time in the opening
phase: it repeats ElGamal or RSA decryption $n$ times until the winning bids
are decided, where $n$ is the number of bidders. It is therefore not suited to
handling many bidders. In this paper, a bid is represented efficiently by
using binary trees: for $2^k$ bidding points, the size of the representation of
bids is just $k$. Furthermore, the computational and round complexity in the
opening phase depends only (probabilistically) on $k$, but not on the number of
bidders. Our protocol can handle both tie bids and many bidders well, and also
represents a bid efficiently for many bidding points.

Up to the present, all auction protocols[2,3,4,5,6,8,11,12] aim at realizing a
sealed-bid auction faithfully, and their concerns are “anonymity” and “efficiency”.
The entertainment seen in a real English auction has not been discussed before. In
this paper, we introduce entertainment into the opening phase by
decreasing the winner candidates little by little. Our price-decision process looks
like the winner-decision process of a lottery. Note that the computational
and round complexity in the opening phase is negligibly low: it depends
(probabilistically) only on k for 2^k bidding points, and not on the number of bidders.

Our electronic auction protocol satisfies the above seven properties. The main
features of our protocol are as follows:

— Perfect anonymity with low computational and low round complexity
(a single non-trusted center): Perfect anonymity means that
nobody (including the center) can identify the bidder of a bid, except for
a winning bid, even after the opening phase. Our protocol realizes perfect
anonymity with both low computational and low round complexity.

— Efficient bidding points: A bid is represented efficiently by using binary
trees: for 2^k bidding points, the size of the representation of a bid is just k.

— Entertainment: Entertainment means that many bidders can enjoy the
opening phase, in which the winner candidates decrease little by little.

This paper is organized as follows. Section 2 summarizes previous work.
Section 3 explains our basic model and presents two practical schemes, one based
on DLP and the other based on a one-way hash function. Section 4 investigates
attacks against our scheme. Section 5 discusses the properties of our protocol.
Section 6 presents the performance of our protocol.



2 Previous Work 

In this section, we summarize the outline of [4] and discuss the weaknesses. 



2.1 Outline 

First C sets L bidding points $\{v_1, \ldots, v_t, \ldots, v_L\}$ and L encryption functions
$\{E_1, \ldots, E_t, \ldots, E_L\}$, one for each bidding point. C keeps each inverse
function $\{E_t^{-1}\}$. The winner is the bidder who bids the highest value, and gets the goods at that value.
We describe how $B_i$ with identity information $ID_i$ places a bid $v_{b_i}$. The bid
vector $M_i(t)$ for $B_i$ is as follows:

$$M_i(t) = \begin{cases} E_t(ID_i) & \text{if } b_i \le t, \\ 0 & \text{otherwise.} \end{cases}$$

We explain how to find the highest bid from the bid vectors. Given the bid vectors
of all bidders, the elements at the same bidding point are added to form the sum-vector
$M(t)$:

$$M(t) = \sum_i M_i(t) \quad (1 \le t \le L). \qquad (1)$$

If $M(t)$ is zero, it means that nobody bids $v_t$. The winning bid $v_t$ is given by the
first $t$ for which $M(t)$ is non-zero. If only one bidder bids $v_t$ (the winning bid), he/she
is identified by $ID_i$ using the inverse function $E_t^{-1}$.

This protocol uses the secret sharing technique[13], since otherwise C could learn the
correspondence of a bidder to a bid from $M_i(t)$. Each bid vector $M_i(t)$ is distributed
among centers in order to keep all bids secret from the centers.




112 K. Omote and A. Miyaji 



2.2 Weaknesses 

There are four weaknesses in [4]. 

— The correspondence of a bidder to a bid can be leaked by a
collusion of centers forming a quorum.

— The protocol handles neither tie bids nor invalid bids well: it cannot
specify the winners, or how many winners there are, if identical winning bids
or faulty bids are submitted.

— The size of the representation of a bid directly depends on the number
of bidding points: for k bidding points, the size of the representation
is just k.

— A winner is decided as soon as the sum-vector M is computed. Therefore
bidders cannot enjoy the opening phase.



3 Our Protocol 

We propose an auction protocol which satisfies perfect anonymity on the
correspondence of a bidder to a bid (except for a winning bid) even after the opening
phase, efficient bid representation by using binary trees, and a feature of
entertainment in the opening phase. For simplicity, we assume that the winners are the
bidders who bid the highest value among the set of bidding points.

3.1 Explanation of Notations 

Notations are defined as follows: 

$n$ : number of bidders
$k$ : number of bid classes
$L$ : number of bidding points ($L = 2^k$)
$i$ : an index for a bidder B ($i = 1, \ldots, n$)
$r_i$, $R_i$ : random numbers for $B_i$
$x_i$ : a secret key for $B_i$
$y_i$ : a public key for $B_i$
$M_i$ : a bid vector for $B_i$
$f(\cdot)$ : a one-way function (e.g. DLP, a hash function)

3.2 Preliminary 

— Initialization: C sets up a one-way function $f$ and publishes $f$ to all bidders B.

— Requesting by vendor: V requests C to hold an auction to sell her/his goods.

— Entry of bidders: Before an auction starts, each bidder who wants to buy the
goods executes the following procedure: generate a pair of secret key $x_i$ and
public key $y_i$, send $y_i$ to C, and obtain a certificate for it from the center.

— Setting up of bidding points: C sets up $L = 2^k$ bidding points for the goods
offered by V.

Fig. 1. Example of Bidding Points (columns: Class 1, Class 2, Class 3)



3.3 Bidding Points 

A binary number denotes the value of a bidding point. For example,
eight bidding points are given by three classes in Figure 1. Generally, there are
$2^k$ $(= L)$ bidding points for k classes. Note that a bid is represented by a bid
vector $M_i$ whose size depends only on k. As a result, it is possible to handle
many bidding points.

In our protocol, bidding points have two properties as follows: 

1. In a representation of a bid, a binary number 0 or 1 expresses whether a bid 
opens the next class or not. 

2. A binary expression can set up more bidding points. This can reduce the 
probability of tie. 
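
As a small illustration of the binary representation, the following Python sketch maps a bidding point to its k class bits and back. The names `bid_bits` and `bid_index`, and the convention that class 1 holds the most significant bit, are assumptions made for this example and are not fixed by the text.

```python
def bid_bits(index: int, k: int) -> list[int]:
    """Return the k class bits of bidding point `index` (0 <= index < 2**k).

    Assumption: class 1 is the most significant bit, so a larger index
    corresponds to a higher bidding point.
    """
    if not 0 <= index < 2 ** k:
        raise ValueError("index out of range for k classes")
    return [(index >> (k - 1 - s)) & 1 for s in range(k)]


def bid_index(bits: list[int]) -> int:
    """Inverse of bid_bits: rebuild the bidding point index from its bits."""
    index = 0
    for b in bits:
        index = (index << 1) | b
    return index


if __name__ == "__main__":
    k = 3  # three classes give 2**3 = 8 bidding points, as in Figure 1
    for point in range(2 ** k):
        print(point, bid_bits(point, k))
```

With k = 3 the loop lists all eight bidding points, each described by only three bits.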

3.4 Bidding Phase 

A bid sent by $B_i$ to C is represented by a bid vector $M_i$. The format of $M_i$ is
defined as follows:

$$M_i = [\text{class } 1, \text{class } 2, \ldots, \text{class } k, ID_i].$$

The bid vector $M_i$ consists of a value expressing 0 or 1 in each class and
the identification information of $B_i$ in the last row. This $ID_i$ cannot be opened
unless $B_i$ is a winner candidate. Therefore anonymity is satisfied even against C. $M_i$
is opened from class 1 to $ID_i$ one by one. By using $ID_i$, we can confirm who
placed the highest bid. We explain how $B_i$ places a bid. For simplicity, suppose $B_i$ places
a bid $v_{b_i} = (1 \cdots 1\,0\,1 \cdots 1\,0\,1)$ whose $t$-th and $(k-1)$-th bits are 0. Then the bid
vector $M_i$ is as follows:

$$M_i = [\,f^k(r_i) + f^{k-1}(r_i),\ \ldots,\ f^{k-t+1}(r_i) + R_{i,k-t},\ \ldots,\ f^2(r_i) + R_{i,1},\ f(r_i) + r_i,\ r_i + x_i\,].$$

Here we denote the $s$-th row of $M_i$ by $M_{i,s}$ $(1 \le s \le k+1)$.

Step 1. $B_i$ generates a random number $r_i$ and computes $f(r_i), \ldots, f^k(r_i)$ by
using a one-way function $f$ and $r_i$.

Step 2. $B_i$ constructs a bid vector $M_i$ corresponding to $v_{b_i}$:

$$M_{i,s} = \begin{cases} f^{k-s+1}(r_i) + f^{k-s}(r_i) & \text{if the } s\text{-th bit of } v_{b_i} = 1, \\ f^{k-s+1}(r_i) + R_{i,k-s} & \text{if the } s\text{-th bit of } v_{b_i} = 0, \end{cases} \quad (1 \le s \le k),$$

$$M_{i,k+1} = r_i + x_i,$$

where $R_{i,k-s}$ is a random number and $x_i$ is $B_i$'s secret key.

Step 3. $B_i$ has to keep $\{r_i, f(r_i), \ldots, f^k(r_i)\}$ secret, but needs to possess only
$\{f^k(r_i), f^{k-t}(r_i), f(r_i)\}$ as opening keys.

Step 4. $B_i$ sends $M_i$ to C. $M_i$ does not need to be encrypted, because
$B_i$ keeps the opening key $f^k(r_i)$ secret to conceal the value of $v_{b_i}$.
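
Steps 1 to 3 can be sketched in Python as follows. This is only an illustration under stated assumptions: $f$ is taken to be SHA-1 over a 20-byte encoding, "+" is taken to be addition modulo 2^160, and the names `f`, `hash_chain` and `make_bid_vector` are hypothetical. The paper does not fix the arithmetic in which rows are added.

```python
import hashlib
import secrets

MOD = 1 << 160  # assumption: "+" means addition modulo 2**160


def f(x: int) -> int:
    """Assumed one-way function: SHA-1 over the 20-byte encoding of x."""
    return int.from_bytes(hashlib.sha1(x.to_bytes(20, "big")).digest(), "big")


def hash_chain(r: int, k: int) -> list[int]:
    """Return [r, f(r), f^2(r), ..., f^k(r)]."""
    chain = [r]
    for _ in range(k):
        chain.append(f(chain[-1]))
    return chain


def make_bid_vector(bits: list[int], r: int, x: int) -> tuple[list[int], list[int]]:
    """Build the k + 1 rows of a bid vector and the bidder's opening keys.

    Row s (1-based): f^{k-s+1}(r) + f^{k-s}(r) if bit s is 1,
                     f^{k-s+1}(r) + R          if bit s is 0 (R random).
    Row k + 1:       r + x, where x stands for the bidder's secret key.
    """
    k = len(bits)
    chain = hash_chain(r, k)
    rows = []
    keys = [chain[k]]  # the first opening key is always f^k(r)
    for s, bit in enumerate(bits, start=1):
        if bit == 1:
            rows.append((chain[k - s + 1] + chain[k - s]) % MOD)
        else:
            rows.append((chain[k - s + 1] + secrets.randbelow(MOD)) % MOD)
            keys.append(chain[k - s])  # key needed to continue past this 0
    rows.append((r + x) % MOD)
    return rows, keys


if __name__ == "__main__":
    r = secrets.randbelow(MOD)
    x = secrets.randbelow(MOD)  # stand-in for the secret key x_i
    # k = 5 with zeros at t = 2 and k - 1 = 4, as in the example bid above
    rows, keys = make_bid_vector([1, 0, 1, 0, 1], r, x)
    print(len(rows), "rows,", len(keys), "opening keys")
```

For the example bid with zeros in positions t and k - 1, the returned keys are exactly f^k(r), f^{k-t}(r) and f(r), matching Step 3.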

Our bid vector has the following features: 

1. Anonymity of the correspondence of a bidder to a bid is satisfied as long as
the opening keys are kept secret.

2. If one opening key is posted, then the rows of $M_i$ are opened one by one up to
the row corresponding to a “0” in the bid. However, the row following that “0”
is never opened as long as the next opening key is kept secret.

3. Everybody can verify the validity of a bid vector by checking $f^{k-s+1}(r_i) =
f(f^{k-s}(r_i))$, both sides of which are open to everybody.

4. Only the bid vectors of bidders who place the highest bid are opened one by
one down to the last row, in which their secret key is set. Furthermore, everybody
can confirm the validity of both the highest bid and the winners.



3.5 Opening Phase 

This section presents the opening phase of our protocol. First C publishes both the
bid vectors and the public keys of the bidders on the Internet. Note that nobody gets
any information about the correspondence of a bidder to a bid. For simplicity,
we assume that the bid $v_{b_j}$ of $B_j$, of the form given in Section 3.4, is the highest in this auction.

Fig. 2. Opening Example

[Step 1] Each $B_i$ sends the first opening key $f^k(r_i)$ to C. Then each bid vector
$M_i$ is opened up to the row corresponding to the first “0”, while the relation
$f^{k-s+1}(r_i) = f(f^{k-s}(r_i))$ is confirmed for each opened row $s$. Everybody can also
confirm the 0 in the $t$-th row of $M_i$ by checking that the value recovered from that row
does not satisfy this relation.

[Step 2] Only the bidders $B_i$ whose bid vectors have been opened furthest send
the next opening key (e.g. $M_3$ in Figure 2). In this case, the next opening key is
$f^{k-t}(r_i)$. In the same way as in Step 1, this procedure continues down to the last row.
Note that $B_i$'s secret key is not opened as long as $B_i$ keeps the final opening key
secret.

[Step 3] Everybody can confirm that $B_j$ is the winner with bid vector $M_j$ by
checking that the secret key $x_j$, which is revealed in the
last row, matches the public key $y_j$.
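
The opening rounds can be simulated with the following Python sketch. It rests on assumptions that go beyond the text: $f$ is SHA-1, rows are combined by addition modulo 2^160, a row is read as 1 when the value recovered from it hashes to the current key, and the function names are hypothetical. The final check of the revealed secret key against the public key (Step 3) is omitted.

```python
import hashlib
import secrets

MOD = 1 << 160  # assumption: rows are combined by addition modulo 2**160


def f(x: int) -> int:
    return int.from_bytes(hashlib.sha1(x.to_bytes(20, "big")).digest(), "big")


def make_bid(bits, r, x):
    """Bid vector rows and opening keys, built as in Section 3.4."""
    k = len(bits)
    chain = [r]
    for _ in range(k):
        chain.append(f(chain[-1]))
    rows, keys = [], [chain[k]]
    for s, bit in enumerate(bits, start=1):
        if bit:
            rows.append((chain[k - s + 1] + chain[k - s]) % MOD)
        else:
            rows.append((chain[k - s + 1] + secrets.randbelow(MOD)) % MOD)
            keys.append(chain[k - s])
    rows.append((r + x) % MOD)
    return rows, keys


def open_rows(rows, start, key):
    """Open rows from 0-based index `start` using `key`.

    Returns (bits, next_row, secret); secret is None unless the last row
    (holding the secret key) was reached.
    """
    k = len(rows) - 1
    bits, s = [], start
    while s < k:
        candidate = (rows[s] - key) % MOD
        if f(candidate) == key:   # row encodes 1: candidate is the next key
            bits.append(1)
            key = candidate
            s += 1
        else:                     # row encodes 0: wait for the next opening key
            bits.append(0)
            return bits, s + 1, None
    return bits, k + 1, (rows[k] - key) % MOD


def run_opening(bids):
    """Simulate the rounds; bids maps a label to (rows, keys).

    Returns (winning_bits, {label: revealed_secret_key}).
    """
    pos = {name: 0 for name in bids}
    bits = {name: [] for name in bids}
    used = {name: 0 for name in bids}
    candidates = list(bids)
    while True:
        revealed = {}
        for name in candidates:
            rows, keys = bids[name]
            key = keys[used[name]]          # the bidder sends the next opening key
            used[name] += 1
            new_bits, pos[name], secret = open_rows(rows, pos[name], key)
            bits[name] += new_bits
            if secret is not None:
                revealed[name] = secret
        if revealed:                        # some vector was opened to the last row
            return bits[next(iter(revealed))], revealed
        furthest = max(pos[name] for name in candidates)
        candidates = [name for name in candidates if pos[name] == furthest]


if __name__ == "__main__":
    demo = {
        "M1": make_bid([1, 0, 1], secrets.randbelow(MOD), 111),
        "M2": make_bid([1, 1, 0], secrets.randbelow(MOD), 222),
        "M3": make_bid([0, 1, 1], secrets.randbelow(MOD), 333),
    }
    print(run_opening(demo))
```

Only the candidates whose vectors were opened furthest send another key in each round, so the number of rounds is bounded by k + 1 regardless of the number of bidders.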



3.6 Schemes Based on a Practical One-Way Function 

We present two examples of a one-way function $f$: one is based on DLP[1] and the other
on a hash function.

[DLP] C selects a large prime $p$ and $g \in Z_p^*$ with prime order $q$. Then the one-way
function $f$ is set to $f(r) = g^r \bmod p$.

[One-way hash function] Let $h(\cdot)$ be a cryptographically strong hash function
such as SHA-1[7] or MD5[9]. Then the one-way function $f$ is set to $f(r) = h(r)$, in
the same way as “PayWord”[10].
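
Both choices of $f$ can be iterated in the same way to obtain $f^k(r)$. The sketch below uses toy DLP parameters (p = 23, q = 11, g = 2) purely for illustration; these values, the byte encoding passed to SHA-1, and the function names are assumptions, and a real instantiation would use a large prime p.

```python
import hashlib

# Toy parameters (assumption): 2 has prime order 11 modulo the prime 23.
P, Q, G = 23, 11, 2


def f_dlp(r: int) -> int:
    """DLP-based one-way function f(r) = g^r mod p."""
    return pow(G, r, P)


def f_hash(r: int) -> int:
    """Hash-based one-way function f(r) = h(r), with h = SHA-1."""
    data = r.to_bytes((r.bit_length() + 7) // 8 or 1, "big")
    return int.from_bytes(hashlib.sha1(data).digest(), "big")


def iterate(f, r: int, k: int) -> int:
    """Return f^k(r), i.e. f applied k times to r."""
    for _ in range(k):
        r = f(r)
    return r


if __name__ == "__main__":
    print(iterate(f_dlp, 3, 2))          # f^2(3) in the toy group
    print(hex(iterate(f_hash, 3, 2)))    # f^2(3) with SHA-1
```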

4 Attacks 

This section discusses some attacks against our protocol. 

4.1 Invalid Bid Vector 

We show that an invalid bid has no influence on the auction
proceedings. Figure 3 shows two types of invalid bid vector:

1. a bidder does not embed her/his secret key into the last class of a bid vector
[Figure 3-M3],

2. a bidder does not embed the proper opening key into a bid vector [Figure
3-M4].

Fig. 3. Examples of invalid bid

First we discuss case 1. Unless M3 is a winner candidate, there is no
problem: M3 is simply ignored. If M3 is a winner candidate as in Figure 3, nobody
can identify the bidder, because B3's secret key is not embedded in M3. In such a case,
M3 is simply removed from the auction as an invalid bid. In our protocol, bid
vectors are opened from the highest bid. Therefore the auction proceedings
simply continue without the invalid bid vector.

Next we discuss case 2. Once M3 is excluded, both M1 and M4 are winner candidates.
However, nobody can open class 4 of M4, since the proper opening key is not
embedded in class 4 of M4. In such a case, M4 is also ignored.
Therefore M1 is the only winner candidate. The opening phase continues without
M3 and M4.

In our protocol, as in [3,4,5,11,12], we cannot identify the invalid bidders.
However, our protocol has the feature that the bid vector of each bidder is
opened independently. Therefore even if an invalid bidder places a bid vector,
the auction proceedings are unaffected: all invalid bids are simply ignored.
So our protocol is resistant to disturbance, i.e. it is robust.

4.2 Group Collusion 

We investigate group collusion attacks by some bidders: the attackers want to get
the goods at the lowest price available. For simplicity, let {I1, I2, I3} be a group
of invalid bidders. There are three cases of group collusion, shown in Figure 4, which depicts
a part of the binary tree of bids:

Case 1 (Figure 4-(a)): there are only plural invalid bidders I1, I2 and I3 in the higher
tree (Tree 1).

Case 2 (Figure 4-(b)): there is only one invalid bidder I1 in the higher tree.

Case 3 (Figure 4-(c)): there is no invalid bidder, but there are some valid bidders
V1, V2 and V3, in the higher tree.

Fig. 4. Group Collusion

In case 1, the attackers can get the goods at the lowest bid, that of I3, by canceling the two
bids of I1 and I2. However, in both case 2 and case 3, it is impossible for the attackers
to control the winning bid. Furthermore, if an attacker I1 places the highest bid
of all bidding points, then I1 cannot deny the winning bid after the opening phase. To
sum up, the attackers can control the winning bid only in case 1, since they
cannot get the goods at a lower price than that of the valid bidders. Therefore such
collusion attacks have little influence on the auction proceedings.

C may counter this group collusion by placing some bids randomly, which makes
it hard for invalid bidders to control the winning bid. Of course C should place
some bids near the winning bid.



5 Properties 

Our protocol satisfies the following properties: 

— Fairness of bidders — All bidders can observe the proper polling on the Internet.

— Security of bids — Security of bids means that: 1. Before the opening,
a bid cannot be revealed. 2. Any bidder can check that her/his bid has
not been forged. In our protocol, each row of a bid vector is one of two kinds of random
numbers, $f(r_i) + r_i$ or $r_i + r'$, built from a one-way function $f$ and random
numbers $r_i$ and $r'$. As for the former, $r_i$ is kept secret as long as $f(r_i)$ is not
opened, so its security depends on $f$. As for the latter, $r'$ is chosen randomly,
and $r_i$ is kept secret as long as the next row is not opened. Therefore its
security also depends on $f$. The security against attacks that use all the row data of
a bid vector also depends on $f$. On the other hand, if a bid is falsified, then
the corresponding bidder can easily notice the faulty bid, since all bid vectors
are published on the Internet.

— Anonymity — In our protocol, only a winner’s secret key is revealed, which 
identifies the corresponding bidder. On the other hand, other secret keys are 
kept secret even after the opening phase. As a result, nobody (including a 
center) can know the correspondence of a bidder to a bid except for a winner. 

— Validity of winning bids — Since bid vectors are opened one by one from 
the higher bid, apparently a winning bid is the highest of all bids. Moreover 
the validity of a bid vector is easily checked by a one-way function and secret 
key. 

Table 1. Performance

         Total communication amount (bit)         Each bidder's computation amount
         Bidding             Opening               Bidding        Opening
[4]      1024(2^k - 1)mn     0                     V · 2^k        0
DLP      1024(k + 1)n        1024(2 - 1/2^k)n      D · k          0
Hash     160(k + 1)n         160(2 - 1/2^k)n       H · k          0



— Non-repudiation — A winner Bj cannot deny her/his bid since Bj’s secret 
key is revealed. 

— Robustness — Our protocol has a feature that each bid is independently 
opened. Therefore if invalid bids are placed, the auction proceedings will be 
unaffected: invalid bids are simply ignored. 

— Entertainment — An English auction is entertaining in that it not
only decides a winner but also pleases all participants until the winner is
decided. In our protocol, we introduce a feature of entertainment into the
opening phase by decreasing the winner candidates one by one, which looks
like the winner-decision process of a lottery. Since we aim at a feature of
entertainment, our protocol reveals part of the distribution of bids. However,
unlike [2,6], our protocol does not reveal the whole distribution of bids and,
better still, satisfies anonymity.

— Tie — In electronic auction protocols, bidding points are often set discretely
[3,4,5,11,12]. In such a situation, two properties are required: 1. Winners
should be specified even if two or more bidders place the same winning
bid. 2. The probability of a tie should be reduced by setting many bidding
points. As for the former, our protocol can specify the winners in the case of
identical winning bids. As for the latter, our protocol can set many bidding
points, such as $2^k$, while keeping the computational, communication and round
complexity of $B_i$ and C low, all of which depend on k. Thus the probability
of a tie can be reduced.



6 Performance 

In this section, we compare our protocol with [4] from the point of view of
communication and computation amount, as shown in Table 1. Here let
the number of bidding points and bidders be $2^k$ and n, respectively. In [4], plural
centers are required, whose number is denoted by m $(\ge 2)$. We assume the one-way
function $f$ to be DLP or a one-way hash function with 160-bit output; its output
size is denoted by $|f|$. In [4], the communication amount in the bidding phase
depends on $2^k$ and m. In our protocol, on the other hand, the communication
amount depends only on k, since a single center is sufficient for anonymity.
Therefore the communication amount can be dramatically reduced.

Next we discuss the communication amount in the opening phase. Since we
aim at a feature of entertainment, communication between $B_i$ and C is required
in the opening phase. But we see in Table 1 that the communication amount
in the opening phase is negligibly small. For simplicity, we assume that there are
$n/2^i$ bidders in each branch of class $i$ on the average, each of whom sends an opening key
with some probability. Therefore the communication amount in the opening phase is at most

$$|f| \left( n + \sum_{i=1}^{k} \frac{n}{2^i} \right) = |f| \left( 2 - \frac{1}{2^k} \right) n.$$

Lastly we discuss the computation amount. Let V be the computation amount
to compute the distribution information of the secret sharing technique, D the
computation amount of a modular exponentiation, and H the computation
amount of a one-way hash function. By the same argument as for the
communication amount, the computation amount of [4] depends on $2^k$, while
that of our protocol depends only on k.
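
To give a feeling for the orders of magnitude, the following Python sketch evaluates the communication formulas as read from Table 1. The function names are hypothetical, and the coefficient of [4] is taken as 1024(2^k - 1)mn, which is a reading of the garbled table entry rather than a confirmed value.

```python
from fractions import Fraction


def bidding_bits_ref4(k: int, n: int, m: int) -> int:
    """Bidding-phase communication of [4]: 1024 (2^k - 1) m n bits (as read from Table 1)."""
    return 1024 * (2 ** k - 1) * m * n


def bidding_bits_ours(k: int, n: int, f_bits: int) -> int:
    """Bidding-phase communication of the proposed protocol: |f| (k + 1) n bits."""
    return f_bits * (k + 1) * n


def opening_bits_ours(k: int, n: int, f_bits: int) -> Fraction:
    """Upper bound on the opening-phase communication: |f| (2 - 1/2^k) n bits."""
    return f_bits * (2 - Fraction(1, 2 ** k)) * n


if __name__ == "__main__":
    k, n, m = 10, 100, 3  # 1024 bidding points, 100 bidders, 3 centers in [4]
    print("[4], bidding :", bidding_bits_ref4(k, n, m))
    print("DLP, bidding :", bidding_bits_ours(k, n, 1024))
    print("Hash, bidding:", bidding_bits_ours(k, n, 160))
    print("Hash, opening:", float(opening_bits_ours(k, n, 160)))
```

The opening-phase bound stays below 2|f|n for every k, which is why it is small compared with the bidding-phase cost of [4].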

7 Conclusion 

We have proposed an anonymous auction protocol with a single non-trusted 
center. Our protocol realizes the following features: 

Perfect anonymity with low computational and low round complexity :
Nobody can identify a bidder from her/his bid except for a winning
bid even after the opening phase.

Efficient bidding points : For $2^k$ bidding points, the size of the
representation of bids is reduced to just k by using binary trees.

Entertainment : Many bidders can enjoy the opening phase by decreasing 
winner candidates little by little. 

Robustness : Even if a bidder sends an invalid bid vector, the auction 
process is unaffected. 

Application : Our protocol can be easily applied to a power auction, which 
decides the plural winners. 

Abstract. Recently, some divisible electronic cash (e-cash) systems
have been proposed. However, in existing divisible e-cash systems,
efficiency or unlinkability is not sufficiently accomplished. In the existing
efficient divisible cash systems, all protocols are conducted in the
order of the polynomial of log N, where N is the divisibility precision (i.e.,
(the total coin amount)/(minimum divisible unit amount)), but
payments divided from a coin are linkable (i.e., anyone can decide whether
the payments are made by the same payer). The linked payments help
anyone to trace the payer if N is large. On the other hand, in the
existing unlinkable divisible e-cash system, the protocols are conducted in
the order of the polynomial of N, and thus it is inefficient for large N.
In this paper, an unlinkable divisible e-cash system is proposed, where
all protocols are conducted in the order of $(\log N)^2$.

Keywords: Electronic cash, Divisibility, Unlinkability, Group signature



1 Introduction 

As the core to realizing the electronic commerce, the electronic cash (e-cash) is 
in great demand. In e-cash systems, a customer withdraws electronic coins from 
a bank, and the customer pays the coins to a shop in the off-line manner. The 
off-line means that the customer has no need to communicate with the bank or 
a trusted third party during the payment. Finally, the shop deposits the paid 
coins to the bank. 

To protect the privacy of customers, each payment should be anonymous,
and furthermore unlinkability should be satisfied. Unlinkability means that
nobody except the trusted third party can determine whether two
payments are made by the same customer. In linkable anonymous e-cash systems,
the linked payments enable others to trace the payer by other means (e.g.,
correlating the payments' locality, date, frequency, etc.), as noted in [1].

In practice, it is desirable that e-cash systems are divisible, which means that
payments of any amount up to the monetary amount of a withdrawn coin can
be made. Hereafter, let N be (the total coin amount)/(minimum divisible unit
amount). N indicates the divisibility precision, and thus N needs to be large
from the viewpoint of convenience. For example, when the total coin amount
is $1000 and the minimum divisible unit amount is 1 cent, N is 100,000, which is about 2^17.
Therefore, the computational complexity with respect to N is an important criterion for
divisible e-cash systems. In [2,3], efficient divisible e-cash systems are proposed in which
all protocols are conducted in $O(\mathrm{poly}(\log N))$, where poly denotes
a polynomial. However, the systems in [2,3] do not satisfy unlinkability
among the payments derived from the same coin. Thus, the larger N grows, the
more easily the payer may be traced owing to the linked payments.

In [4], as a variant of divisible e-cash systems, an electronic coupon (e-coupon) 
system is proposed. In this system, a withdrawn coin, called a ticket, is divided 
into sub-tickets, and only the sub-tickets can be spent. The advantage of this 
system is to satisfy the unlinkability among all payments including ones from 
the same ticket, and thus both divisibility and unlinkability hold. On the other
hand, the computational complexity is $O(\mathrm{poly}(N))$.

In this paper, a divisible e-cash system is proposed, where (1) unlinkability
among all payments holds, and (2) the computational complexity of all
protocols is $O((\log N)^2)$. This e-cash system is based on the above e-coupon
system. In the e-coupon system, a payment is accomplished by proving the
ownership of a withdrawn ticket, which is the bank's digital signature, without
revealing the ticket. Furthermore, to detect over-spending, the payer is forced
to send values which are the same if and only if the payer uses the same
sub-ticket. In the divisible e-cash system of this paper, the binary tree approach is
adopted to realize $O(\mathrm{poly}(\log N))$ computational complexity, as in the
divisible e-cash systems [2,3]. In this approach, a withdrawn coin has a binary tree,
where the root represents the monetary amount of the coin and each other node
represents half of the amount of its parent node. In addition to the proof of
the ownership of the coin, as in the e-coupon system, the payer is forced
to send values which are linked if and only if nodes with a parent-child
relationship are used for payments or the same node is used twice or more, which
implies over-spending.

This paper is organized as follows: Section 2 describes a model and requi- 
rements for a divisible e-cash system. In Section 3, the binary tree approach 
and cryptographic primitives used in the proposed e-cash system are shown. In 
Section 4, a divisible e-cash system satisfying the requirements based on the mo- 
del is proposed. Section 5 discusses the security and efficiency of the proposed 
system. Section 6 concludes this paper. 

2 Model and Requirements 

We adopt the model of “escrow cash” [5] to prevent illegal acts by anonymous
customers. In this model, trusted third parties, called trustees, participate in the
system. The trustees can cooperatively revoke the anonymity of payments to prevent
illegal acts such as money laundering, the blackmailing attack [6] and so on. Though
in this paper one trustee has the authority of revocation for simplicity, the model is
easily extended to that of multiple trustees by using threshold
cryptosystems [7].

The requirements for divisible e-cash systems are as follows [2,5]:

Unforgeability: A coin and a transcript of a payment can not be forged. 

No over-spending: The customer who over-spends a coin can be identified. 
No swindling: No one except the customer who withdraws a coin can spend 
the coin. The deposit information can not be forged. 

Anonymity: No one except the payer and the trustee can trace the payer from 
the payment. 

Unlinkability: No one except the payer and the trustee can determine whether 
any pair of payments is executed by the same customer, unless the payments 
cause over-spending. 

Anonymity revocation: Anonymity of a transcript of a payment can be revo- 
ked only by the trustee and when necessary, where the following revocation 
procedures should be accomplished: 

Owner tracing: To identify the payer of a targeted payment. 

Coin tracing: To link a targeted withdrawal of a coin to the payments 
derived from the coin. 

Only the transcript for which a judge’s order is given must be de-anonymized. 
Divisibility: Payments of any amount up to the monetary amount of a with- 
drawn coin can be made. 

Off-line-ness: During payments, the payer communicates only with the shop. 



3 Preliminaries 

3.1 Binary Tree Approach 

In the proposed e-cash system, the binary tree approach is adopted to
accomplish divisibility, as in the divisible e-cash systems of [2,3]. Thus, before
describing our system, we review this approach.

Each coin worth $w = 2^{\ell-1}$ is assigned a binary tree of $\ell$ levels. Each node
of the tree is assigned a denomination. The root node, denoted $n_0$, indicates
the monetary amount $w$ of the coin, and any other node $n_{j_1 j_2 \cdots j_u}$ $(2 \le u \le \ell)$
indicates half of the amount of its parent node $n_{j_1 j_2 \cdots j_{u-1}}$, where $j_1 = 0$ and
$j_i \in \{0, 1\}$ for $i = 2, \ldots, u$. A binary tree of 3 levels is illustrated in Figure 1.

Fig. 1. A binary tree of 3 levels (value of each node by level: $w$, $w/2$, $w/4$)

To accomplish divisibility, the following rule is used:

Divisibility rule: When a node is used, none of its descendant and ancestor nodes
may be used. Furthermore, no node may be used more than once.

This rule is satisfied if and only if over-spending is prevented [2].
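
The divisibility rule can be checked mechanically. In the Python sketch below, a node $n_{j_1 \cdots j_u}$ is written as the string of its indices (the root is "0"), so that ancestors are exactly the proper prefixes of a node. The string encoding, the greedy choice of nodes in `pick_nodes`, and all function names are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import combinations


def node_value(node: str, w: int) -> Fraction:
    """Value of a node: w for the root "0", halved at each further level."""
    return Fraction(w, 2 ** (len(node) - 1))


def satisfies_divisibility_rule(used: list[str]) -> bool:
    """True iff no used node is an ancestor or descendant of another used node
    and no node is used more than once."""
    return not any(a.startswith(b) or b.startswith(a) for a, b in combinations(used, 2))


def pick_nodes(amount: int, levels: int) -> list[str]:
    """Choose nodes worth `amount` in total from a coin of w = 2**(levels - 1)."""
    w = 2 ** (levels - 1)
    if not 0 <= amount <= w:
        raise ValueError("amount exceeds the coin")

    def allocate(node: str, value: int, rest: int) -> list[str]:
        if rest == 0:
            return []
        if rest == value:
            return [node]
        half = value // 2
        left = min(rest, half)  # fill the left subtree first
        return allocate(node + "0", half, left) + allocate(node + "1", half, rest - left)

    return allocate("0", w, amount)


if __name__ == "__main__":
    nodes = pick_nodes(3, 3)  # pay 3 units from a coin of w = 4
    print(nodes, satisfies_divisibility_rule(nodes))
    print(satisfies_divisibility_rule(["00", "001"]))  # parent and child: over-spending
```

Any payment assembled by `pick_nodes` uses pairwise unrelated nodes, while reusing a node or combining a node with one of its descendants is rejected.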

In the proposed system, each node $n_{j_1 \cdots j_u}$ possesses a value $F_{j_1 \cdots j_u}$, which is
called its F value. The F value $F_0$ of the root node is proper to the coin, and the F value
$F_{j_1 \cdots j_u}$ of any other node is obtained by applying a one-way bijection to the F value
$F_{j_1 \cdots j_{u-1}}$ of its parent node. Thus, nodes which have a parent-child relationship can
be linked by a sequence of bijections, while nodes without such a relationship
do not have such a sequence. Through this link, over-spending can be detected,
and the over-spender can be identified by the trustee.

3.2 Signatures Based on Zero-Knowledge Proofs of Knowledge 

As in the e-coupon system [4], the proposed e-cash system uses an extension
of the group signature scheme of [8], where signatures based on
zero-knowledge proofs of knowledge (SPK's) are used as primitives to prove the knowledge
of secret values without leaking any useful information. Since the proposed system
also uses several types of SPK's, this subsection reviews them. They are
converted from zero-knowledge proofs of knowledge (PK's) by the so-called
Fiat-Shamir heuristic [9]. That is, the prover determines the challenge by applying
a collision-resistant hash function to the commitment and the signed message,
and then computes the response as usual. The resulting signature consists of
the challenge and the response. Such SPK's can be proven to be secure in
the random oracle model [10], given the security of the underlying PK's. Let
$SPK\{(\alpha, \beta, \ldots) : Predicates\}(m)$ be the signature on message $m$ proving that
the signer knows $\alpha, \beta, \ldots$ satisfying the predicates $Predicates$. In this
notation, Greek letters denote the secret knowledge and the other letters denote
public parameters shared by the signer and the verifier. In the proposed system,
as in the group signature scheme [8], which is based on the hardness of the
discrete logarithm problem, relations among discrete logarithms in
cyclic groups are used as the proved predicates. In the following, let $G$ and $G_1$
be cyclic groups with orders $q$ and $q_1$, respectively. The discrete logarithm of
$y \in G$ to the base $z \in G$ is $x \in Z_q$ satisfying $y = z^x$, if such an $x$ exists. We
denote $x = \log_z y$. This is extended to the representation of $y \in G$ to the bases
$z_1, z_2, \ldots, z_k \in G$, which is $x_1, x_2, \ldots, x_k \in Z_q$ satisfying
$y = z_1^{x_1} \cdot z_2^{x_2} \cdots z_k^{x_k}$, if such
$x_i$'s exist. The double discrete logarithm of $y_1 \in G_1$ to the bases $z_1 \in G_1$ and
$z \in G$ is $x \in Z_q$ satisfying $y_1 = z_1^{(z^x)}$, if such an $x$ exists. The $e$-th root of the
discrete logarithm of $y \in G$ to the base $z \in G$ is $x \in Z_q$ satisfying $y = z^{(x^e)}$, if
such an $x$ exists.
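
The simplest instance of this notation, $SPK\{\alpha : y = z^{\alpha}\}(m)$, can be sketched in Python as a Schnorr-style proof made non-interactive with the Fiat-Shamir heuristic. The toy group (p = 2039, q = 1019, z = 4), the use of SHA-256 as the hash function, and the function names are assumptions for illustration only; this is not the construction of [8].

```python
import hashlib
import secrets

# Toy group (assumption): P = 2Q + 1 with P, Q prime; Z = 4 has order Q modulo P.
P, Q, Z = 2039, 1019, 4


def _challenge(m: str, y: int, t: int) -> int:
    """Hash the message, the public value and the commitment into a challenge."""
    digest = hashlib.sha256(f"{m}|{y}|{t}".encode()).digest()
    return int.from_bytes(digest, "big")


def spk_sign(x: int, m: str) -> tuple[int, int]:
    """Sign m while proving knowledge of x with y = Z^x; returns (challenge, response)."""
    y = pow(Z, x, P)
    r = secrets.randbelow(Q)
    t = pow(Z, r, P)               # commitment
    c = _challenge(m, y, t)        # challenge derived by hashing (Fiat-Shamir)
    s = (r - c * x) % Q            # response
    return c, s


def spk_verify(y: int, m: str, sig: tuple[int, int]) -> bool:
    """Recompute the commitment as Z^s * y^c and compare the challenges."""
    c, s = sig
    t = (pow(Z, s, P) * pow(y, c, P)) % P
    return c == _challenge(m, y, t)


if __name__ == "__main__":
    x = secrets.randbelow(Q)       # the secret discrete logarithm
    y = pow(Z, x, P)
    sig = spk_sign(x, "payment transcript")
    print(spk_verify(y, "payment transcript", sig))
```

As in the text, the signature consists only of the challenge and the response; the verifier reconstructs the commitment from them.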

The first type of SPK is the signature proving the knowledge of
representations of $y_1, \ldots, y_w \in G$ to the bases $z_1, \ldots, z_v \in G$ on message $m$, and it is
denoted as

$$SPK\{(\alpha_1, \ldots, \alpha_u) : (y_1 = \prod_{j=1}^{l_1} z_{b_{1j}}^{\alpha_{e_{1j}}}) \wedge \cdots \wedge (y_w = \prod_{j=1}^{l_w} z_{b_{wj}}^{\alpha_{e_{wj}}})\}(m),$$

where the constants $l_i \in \{1, \ldots, v\}$ indicate the number of bases in the representation of
$y_i$, the indices $e_{ij} \in \{1, \ldots, u\}$ refer to the elements $\alpha_1, \ldots, \alpha_u$, and the indices
$b_{ij} \in \{1, \ldots, v\}$ refer to the elements $z_1, \ldots, z_v$. For example,
$SPK\{(\alpha, \beta) : y_1 = z_1^{\alpha} \wedge y_2 = z_1^{\beta} z_2^{\alpha}\}(m)$ is an SPK on $m$ of an entity knowing the discrete logarithm
of $y_1$ to the base $z_1$ and a representation of $y_2$ to the bases $z_1$ and $z_2$, where the
$z_2$-part of this representation equals the discrete logarithm of $y_1$ to the base $z_1$.
The second type is an SPK proving the knowledge of the $e$-th root of the discrete
logarithm of $y \in G$ to the base $z \in G$ on $m$, and is denoted as

$$SPK\{\beta : y = z^{(\beta^e)}\}(m).$$

The third type is an SPK proving the knowledge of the $e$-th root of the $z_2$-part
of a representation of $y \in G$ to the bases $z_1, z_2 \in G$ on $m$, and is denoted as

$$SPK\{(\gamma, \delta) : y = z_1^{\gamma} z_2^{(\delta^e)}\}(m).$$

Efficient constructions of these types of signatures are concretely described
in [8].

The fourth type is an SPK proving the knowledge of the discrete logarithm
of $y \in G$ to the base $z \in G$ and of the double discrete logarithm of $y_1 \in G_1$ to the
bases $z_1 \in G_1$ and $z \in G$ on $m$, where the discrete logarithm of $y$ to the base $z$
equals the double discrete logarithm of $y_1$ to the bases $z_1$ and $z$. This is denoted
as

$$SPK\{\epsilon : y = z^{\epsilon} \wedge y_1 = z_1^{(z^{\epsilon})}\}(m).$$

This is described in [11]. Note that there is a difference between this type of
SPK as used in this paper and that in [11], namely the orders of $G_1$ and
$G$. The orders of $G_1$ and $G$ in this paper are different and need not be
prime, whereas the orders in [11] are prime. This difference does not affect the
proof that the underlying PK is a zero-knowledge proof of knowledge. Since the
construction in [11] utilizes a cut-and-choose method, it is less efficient than
the constructions of the other types, which do not utilize this method.

4 An Unlinkable Divisible Electronic Cash System 

In this section, an unlinkable divisible e-cash system is constructed by extending
the group signature scheme [8], in the same way as the unlinkable e-coupon
system [4]. Group signature schemes allow a group member to sign anonymously
on the group's behalf. Furthermore, the anonymity of the signature can be
revoked by the trusted party. In the scheme of [8], the group consists of the owners of
unforgeable certificates issued by the group manager. In the e-coupon system,
the certificate is used as a ticket issued by the bank and the group signature
is used as a transcript of a payment. This simple replacement gives the system
anonymity, unlinkability, unforgeability, no swindling, off-line-ness, and the owner
tracing of the anonymity revocation. Furthermore, in the e-coupon system,
mechanisms to detect the payments derived from the same sub-ticket and to







enable coin tracing are added. The former mechanism is that a payer is forced to
send values which are the same if and only if the payer uses the same sub-ticket.
The latter mechanism is that, in a withdrawal, a customer is forced to send the
encryption, under the trustee's key, of a value which is linked to the payments
derived from the withdrawal. In the proposed divisible e-cash system, this mechanism
of coin tracing is adopted, and the mechanism to detect over-spending is modified
so that the payer is forced to send values which are linked if and only if
nodes with the parent-child relationship are used for payments or the same node
is used twice or more. Concretely, the payer sends the F value $F_{j_1 \cdots j_u}$ for the
payment of the node $n_{j_1 \cdots j_u}$, as noted in Section 3.1.

Assume that each participant publishes the public key of a digital signature
scheme and keeps the corresponding secret key. Hereafter, except in the payment
protocol, the values sent by each participant are signed with the digital signature
scheme. Furthermore, assume that all customers and shops have opened accounts
at the bank. Let 0 be the empty string. If $A$ is a set, $a \in_R A$ means that $a$ is
chosen at random from $A$ according to the uniform distribution. Let $\langle g \rangle$ be a
cyclic group with generator $g$.



4.1 Setup 

To set up the cash system, the bank and trustee generate public and secret
keys. The public keys described in this setup protocol are assigned to a single
monetary amount $w = 2^{\ell-1}$. If multiple monetary amounts are adopted, the setup
is executed for each monetary amount.

1. The bank computes an RSA modulus $n$, two public exponents $e_1, e_2 > 1$,
and two integers $f_1, f_2 > 1$. Note that $e_1, e_2, f_1$ and $f_2$ must satisfy that
solving the congruence $f_1 x^{e_1} + f_2 \equiv v^{e_2} \pmod{n}$ is infeasible. The choices
for $e_1, e_2, f_1$ and $f_2$ are discussed in [8]. Then, the bank computes a
cyclic group $G_n = \langle g_n \rangle$ of order $n$ which is a subgroup of $Z_{p_2}^*$ for a prime
$p_2 = 2n + 1$. Similarly, the bank computes a cyclic group $G_{p_i} = \langle g_{p_i} \rangle$ of
order $p_i$ which is a subgroup of $Z_{p_{i+1}}^*$ for a prime $p_{i+1} = 2p_i + 1$ with all
$i$ ($2 \le i \le \ell$). In these cases, the bank redoes the above procedure from
the computation of $n$ if $2n + 1, 2p_2 + 1, \ldots,$ or $2p_\ell + 1$ is not prime. Furthermore,
the bank chooses elements $h, \tilde{h} \in G_n$, $h_{(2,0)}, h_{(2,1)} \in G_{p_2}, \ldots,$
$h_{(\ell,0)}, h_{(\ell,1)} \in G_{p_\ell}$ whose discrete logarithms to the bases $g_n, g_{p_2}, \ldots, g_{p_\ell}$
are unknown, respectively. Note that $G_{p_2}, \ldots, G_{p_\ell}$ are constructed so that the functions
$h_{(2,0)}^{x_n}, h_{(3,0)}^{x_{p_2}}, \ldots, h_{(\ell,0)}^{x_{p_{\ell-1}}}, h_{(2,1)}^{x_n}, h_{(3,1)}^{x_{p_2}}, \ldots, h_{(\ell,1)}^{x_{p_{\ell-1}}}$ with inputs $x_n \in G_n$,
$x_{p_2} \in G_{p_2}, \ldots, x_{p_{\ell-1}} \in G_{p_{\ell-1}}$ can be defined well as one-way bijections. Finally,
the bank publishes $\mathcal{Y} = (n, e_1, e_2, f_1, f_2, G_n, G_{p_2}, \ldots, G_{p_\ell}, g_n, g_{p_2}, \ldots,$
$g_{p_\ell}, h, \tilde{h}, h_{(2,0)}, \ldots, h_{(\ell,0)}, h_{(2,1)}, \ldots, h_{(\ell,1)})$ as the public key, and keeps the
factorization of $n$ secret.

2. For all $i$ ($0 \le i \le \ell - 1$) and all $J \in \{0, 1\}^i$, the bank makes database $\mathcal{F}_{0J}$
empty, which holds F values included in the transcripts of payments using
the node $n_{0J}$ to detect over-spending in the deposit protocol below.

3. The trustee chooses $\rho \in_R Z_n^*$ to compute $y_R = h^{\rho}$. Then, the trustee makes
$y_R$ public, and keeps $\rho$ secret.

4.2 Withdrawal 

To withdraw a coin, a customer conducts the following protocol with the bank.
This is the same as that of the e-coupon protocol, which corresponds to the issue
of a membership certificate in the group signature scheme.

1. A customer chooses $x \in_R Z_n^*$ to compute $y = x^{e_1} \bmod n$ and $z = g_n^{y}$. Then,
the customer chooses $r_1, r_2 \in_R Z_n^*$ to compute $\tilde{y} = r_1^{e_2}(f_1 y + f_2) \bmod n$,
$C_1 = h^{r_2} \tilde{h}^{y}$, $C_2 = y_R^{r_2}$. Furthermore, the customer computes the following SPK's:

$V_1 = SPK\{\alpha : z = g_n^{\alpha^{e_1}}\}(0)$,

$V_2 = SPK\{\beta : g_n^{\tilde{y}} = (z^{f_1} g_n^{f_2})^{\beta^{e_2}}\}(0)$,

$V_3 = SPK\{(\gamma, \delta) : C_1 = h^{\gamma} \tilde{h}^{\delta} \wedge C_2 = y_R^{\gamma} \wedge z = g_n^{\delta}\}(0)$.

The customer sends the bank $(\tilde{y}, z, C_1, C_2, V_1, V_2, V_3)$.

2. If $V_1$, $V_2$ and $V_3$ are correct, the bank sends the customer $\tilde{v} = \tilde{y}^{1/e_2} \bmod n$
and charges the customer's account the amount $w$.

3. The customer computes $v = \tilde{v}/r_1 \bmod n$ to obtain the coin $(x, v)$, where
$v \equiv (f_1 x^{e_1} + f_2)^{1/e_2} \pmod{n}$.
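
The blinding and unblinding arithmetic of the withdrawal can be sketched with toy numbers. This is a minimal sketch under stated assumptions: the tiny modulus and the values of `E1`, `F1`, `F2` are hypothetical placeholders (suitable choices are discussed in [8]), and the SPK's $V_1, V_2, V_3$ and the ciphertext $(C_1, C_2)$ are omitted.

```python
from math import gcd

N = 61 * 53             # toy RSA modulus n (the bank keeps the factors secret)
E2 = 17                 # public exponent e_2
D2 = 2753               # bank's secret exponent: E2 * D2 = 1 mod phi(n) = 3120
E1, F1, F2 = 3, 5, 7    # hypothetical e_1, f_1, f_2, chosen only for illustration


def customer_blind(x: int, r1: int):
    """Customer side: y = x^e1 and the blinded value r1^e2 * (f1*y + f2) mod n."""
    if gcd(r1, N) != 1:
        raise ValueError("blinding factor must be invertible modulo n")
    y = pow(x, E1, N)
    y_blind = pow(r1, E2, N) * (F1 * y + F2) % N
    return y, y_blind


def bank_sign(y_blind: int) -> int:
    """Bank side: take the e2-th root of the blinded value."""
    return pow(y_blind, D2, N)


def customer_unblind(v_blind: int, r1: int) -> int:
    """Customer side: divide out the blinding factor."""
    return v_blind * pow(r1, -1, N) % N


def is_valid_coin(x: int, v: int) -> bool:
    """Check v^e2 = f1 * x^e1 + f2 (mod n)."""
    return pow(v, E2, N) == (F1 * pow(x, E1, N) + F2) % N
```

Because the bank only ever sees the blinded value, it cannot later recognise $v$ when the coin is spent.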



4.3 Payment 

Assume that each shop owns a unique identifier. Let $m$ be the concatenation of
the identifier of the shop obtaining the payment and the time when the payment
is made. In the payment protocol, the customer pays the shop any amount
$w'$ ($\le w = 2^{\ell-1}$). Let $[w'_\ell \cdots w'_1]$ be the binary representation of $w'$. Then, if
$w'_{\ell-u+1} = 1$ ($1 \le u \le \ell$), the customer pays a node $n_{j_1 \cdots j_u}$ among the nodes
in the $u$-th level that do not violate the divisibility rule, as in [2]. Here, the
payment protocol for a node $n_{j_1 \cdots j_u}$ is shown. By executing this payment protocol
for multiple nodes in parallel, the payment for any amount is accomplished.
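
The mapping from an amount to tree levels described above can be sketched as follows. The function name `levels_to_spend` is hypothetical; the sketch only selects the levels and leaves out the choice of a concrete unspent node at each level.

```python
def levels_to_spend(amount: int, depth: int) -> list:
    """Return the levels u at which one node must be spent to pay `amount`.

    The coin is worth 2**(depth - 1) and a node at level u is worth
    2**(depth - u), so level u is used exactly when bit (depth - u + 1)
    of the amount, counted from the least significant bit, is set.
    """
    coin_value = 2 ** (depth - 1)
    if not 1 <= amount <= coin_value:
        raise ValueError("amount must be between 1 and the coin value")
    return [u for u in range(1, depth + 1) if (amount >> (depth - u)) & 1]
```

For a tree of 3 levels (coin value 4), paying 3 uses one node at level 2 and one at level 3, while paying 4 uses the root alone.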

During the payment, the payer sends the bank the F value of the paid node
together with the group signature. The F value of the root node, denoted $F_{j_1}$, is $\tilde{h}^{y}$.
The F value of a node $n_{j_1 \cdots j_u}$, denoted $F_{j_1 \cdots j_u}$, is $h_{(u, j_u)}^{F_{j_1 \cdots j_{u-1}}}$, where $F_{j_1 \cdots j_{u-1}}$ is the F value
of the parent node. F values of a binary tree of 3 levels are illustrated in Figure 2.
The detailed payment protocol is as follows:

1. The customer computes $C_1 = h^{r} g_n^{y}$ and $C_2 = y_R^{r}$ for $r \in_R Z_n^*$. Furthermore,
the customer computes the following SPK's:

$V_1 = SPK\{(\alpha, \beta) : C_1 = h^{\alpha} g_n^{\beta^{e_1}}\}(m)$,

$V_2 = SPK\{(\gamma, \delta) : C_1^{f_1} g_n^{f_2} = h^{\gamma} g_n^{\delta^{e_2}}\}(m)$,

$V_3 = SPK\{(\varepsilon, \zeta) : C_1 = h^{\varepsilon} g_n^{\zeta} \wedge C_2 = y_R^{\varepsilon}\}(m)$.

Fig. 2. F values of a binary tree of 3 levels



Note that these are the same as the group signature in [8]. Then, the customer
proceeds as follows according to the level $u$ of the paid node:

Case of u = 1 : The customer computes F value of the paid root node, Fj^ = 
IF , and the following SPK’s which proves the validity of F value: 

V 4 = SPK{(rj, 0) : Fj, =h'^ A Ci= 



Then, the customer sends the shop A = {ji, Fj^,Ci,C2, ^1,^2,!^, F4) 
as the payment. 

Case of u = 2 : The customer computes F values of the root node and the 

F 

paid child node, Fj^ = = ^(2^2)' customer computes 

9n = -Fi = for fi Z*, and the following SPK's which proves 
the validity of Fj^j^\ 

V4 = SPK{(?^, 0 ) : Fi = A Cl = h^gnM, 

V5 = SPKU ■■F=gLA Fj,j, = 



Then, the customer sends the shop A = (jij2, Fj^j^, Fj^, Ci, C2, Vi, 
V2,Vs, V4, V5) as the payment. 

Cases of 3 < u < £: The customer computes F values from the root node to 



F 

F- ■ — 

^Jij2 - ^'-(2,12) 



,F, 



the paid node Fj^ = 

Then, to commit F values of nodes except the paid node, the custo- 






F- 

- 



mer computes 



5 p„_i — gp^-iFi 



gl,F2 = 



! ffp 2 ~ 5 p 2 I 

) • • • ! Fu-i = for h eR z*,f2 &R z*^,...,fu~i 

Furthermore, to prove the validity of F value of the paid node by using 
the committed F values, the customer computes the following SPK’s: 






C4 = SPK{{g, 0 ) : Cl = A Cl = h<^gl}{m), 
1/5.1 = SPK{l 4 : Cl = g^^^ A P2 = ~g^^" }(m), 












1/5.2 = SPK{i^ : F2 = 9 % f\F^= 
l/ 5.«-2 = SPK{Lu- 2 ■ Pu^2 = g'pl-l A 

h^u-1 

V5,u-1 = SPK{l^^i : F„_1 = A ^’}(m). 



Finally, the customer sends the shop A = {jij2 ■ ■ • ju, Fj^...j^,gn, gp^, ■ • ■, 
9pi,-ij Fi, F2, ...,Fu~i, C\,C2, Vi,V 2,^3, F4, V^^i, . . ., Vs^u-i) as the 
payment. 

2. The shop verifies that A is correctly formed. If the shop is successful, this 
payment is permitted. 



4.4 Deposit 

In the deposit protocol, over-spending is checked on the divisibility rule shown
in Section 3.1. If over-spending does not occur, the paid amount is deposited
in the account of the shop. Otherwise, the over-spender can be identified by
the owner tracing protocol below. When the node $n_{j_1 \cdots j_u}$ is used for the payment,
the transcript of the payment includes the F value $F_{j_1 \cdots j_u}$. If the same node
is used, the sameness of the F value indicates over-spending. If the nodes $n_{j_1 \cdots j_u}$
and $n_{j_1 \cdots j_{u'}}$ ($u < u'$) with the parent-child relationship are used, which also
means over-spending, the corresponding $F_{j_1 \cdots j_u}$ and $F_{j_1 \cdots j_{u'}}$ have relations as
$F_{j_1 \cdots j_{u+1}} = h_{(u+1, j_{u+1})}^{F_{j_1 \cdots j_u}}, \ldots, F_{j_1 \cdots j_{u'}} = h_{(u', j_{u'})}^{F_{j_1 \cdots j_{u'-1}}}$ for the F values of intermediate
nodes, $F_{j_1 \cdots j_{u+1}}, \ldots, F_{j_1 \cdots j_{u'-1}}$. Thus, the relation enables the bank to
detect over-spending. The following is the detailed protocol to deposit the payment
of the node $n_{j_1 \cdots j_u}$. For the payment of multiple nodes, this protocol is executed
multiple times.



1. The shop sends the bank the transcript of the payment $A$.

2. The bank verifies that the transcript is correctly formed. Then, the bank
checks whether the payment causes the node $n_{j_1 \cdots j_u}$ to be over-spent as
follows:

a) If $u \ge 2$, for all databases $\mathcal{F}_{j_1 \cdots j_t}$ ($1 \le t \le u - 1$) and all $F \in \mathcal{F}_{j_1 \cdots j_t}$,
the bank computes $F_{t+1} = h_{(t+1, j_{t+1})}^{F_t}, \ldots, F_u = h_{(u, j_u)}^{F_{u-1}}$, where $F_t = F$, and checks
$F_u = F_{j_1 \cdots j_u}$.

b) For the database $\mathcal{F}_{j_1 \cdots j_u}$ and all $F \in \mathcal{F}_{j_1 \cdots j_u}$, the bank checks
$F = F_{j_1 \cdots j_u}$.

c) If $u \le \ell - 1$, for all $t$ ($u + 1 \le t \le \ell$), all $j_{u+1}, \ldots, j_t \in \{0, 1\}$, all
databases $\mathcal{F}_{j_1 \cdots j_t}$ and all $F \in \mathcal{F}_{j_1 \cdots j_t}$, the bank computes
$F_{u+1} = h_{(u+1, j_{u+1})}^{F_u}, \ldots, F_t = h_{(t, j_t)}^{F_{t-1}}$, where $F_u = F_{j_1 \cdots j_u}$, and checks
$F_t = F$.

When any check is successful, the paid node is over-spent. Then, the over-spender
can be identified by the owner tracing protocol. Otherwise, the amount
of the node is deposited in the shop's account, and $F_{j_1 \cdots j_u}$ is kept in the
bank's database $\mathcal{F}_{j_1 \cdots j_u}$, while the transcript $A$ is also kept since it can be
used as the witness if over-spending occurs in the future.
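
The over-spending checks of the deposit protocol can be sketched with a toy chain of groups. Everything concrete here is an assumption made for illustration: the moduli form the small chain 11, 23, 47 (each twice the previous group order plus one), a prime order 5 stands in for the RSA modulus $n$, and the bases in `H_ROOT` and `H` are arbitrary elements of the right order. Node paths list only the branch bits below the root.

```python
# Toy chain of orders 5 -> 11 -> 23 with moduli 11, 23, 47 (assumption).
MODULI = [11, 23, 47]          # MODULI[u - 1] is the modulus used at level u
H_ROOT = 4                     # hypothetical root base, of order 5 modulo 11
H = {2: (4, 9), 3: (4, 9)}     # hypothetical bases (h_(u,0), h_(u,1)) per level


def derive(f: int, level: int, bits) -> int:
    """Walk down from a node at `level` along `bits`: child F = base ** parent F."""
    for bit in bits:
        level += 1
        f = pow(H[level][bit], f, MODULI[level - 1])
    return f


def f_value(y: int, path) -> int:
    """F value of the node reached from the root by the branch bits in `path`."""
    return derive(pow(H_ROOT, y, MODULI[0]), 1, path)


class Bank:
    def __init__(self):
        self.db = {}           # node path -> set of deposited F values

    def deposit(self, path, f: int) -> bool:
        """Return True if the deposit reveals over-spending, else record it."""
        path = tuple(path)
        overspent = f in self.db.get(path, set())              # check (b)
        for cut in range(len(path)):                           # check (a)
            for f_anc in self.db.get(path[:cut], set()):
                if derive(f_anc, cut + 1, path[cut:]) == f:
                    overspent = True
        for other, values in self.db.items():                  # check (c)
            if len(other) > len(path) and other[:len(path)] == path:
                if derive(f, len(path) + 1, other[len(path):]) in values:
                    overspent = True
        if not overspent:
            self.db.setdefault(path, set()).add(f)
        return overspent
```

Spending the root and then one of its children is flagged, whereas spending the two sibling children of the same coin is not, which mirrors the divisibility rule.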



4.5 Anonymity Revocation 

When a judge's order of anonymity revocation is given, the following owner
tracing or coin tracing protocol is executed for a targeted payment or withdrawal,
respectively. Furthermore, when over-spending is detected in the deposit protocol,
owner tracing for the over-spent payment is executed. The owner tracing protocol
is the same as the identification of the signer in the original group signature
scheme. The coin tracing protocol is adapted from that of the e-coupon system.

Owner tracing:

1. The bank sends the trustee the transcript of the targeted payment $A$, which
includes $C_1$ and $C_2$.

2. The trustee verifies that the transcript is correctly formed. If it is correctly
formed, the trustee sends the bank $z = C_1 / C_2^{1/\rho}$ and
$SPK\{\alpha : C_1 = z C_2^{\alpha} \wedge h = y_R^{\alpha}\}(0)$. This SPK proves that $(C_1, C_2)$ is decrypted
into $z$.

3. The bank searches the withdrawal transcripts for the value identical with $z$ and
presents the customer's signature on it, which indicates the payer of $A$.
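
The decryption performed by the trustee in owner tracing can be sketched as follows. This is a toy illustration under stated assumptions: a prime-order subgroup modulo 2039 stands in for the group $G_n$ of order $n$, the bases `G_N` and `H` and the secret `RHO` are hypothetical values, and the accompanying SPK is omitted.

```python
# Toy group (assumption): the subgroup of order Q = 1019 modulo P = 2039.
P, Q = 2039, 1019
G_N, H = 4, 9                  # hypothetical stand-ins for g_n and h
RHO = 77                       # trustee's secret key
Y_R = pow(H, RHO, P)           # trustee's public key y_R = h^rho


def payment_ciphertext(y: int, r: int):
    """Payer side: C1 = h^r * g_n^y and C2 = y_R^r."""
    c1 = pow(H, r, P) * pow(G_N, y, P) % P
    c2 = pow(Y_R, r, P)
    return c1, c2


def trustee_decrypt(c1: int, c2: int) -> int:
    """Trustee side: z = C1 / C2^(1/rho), with 1/rho taken modulo the group order."""
    rho_inv = pow(RHO, -1, Q)
    h_r = pow(c2, rho_inv, P)              # recovers h^r
    return c1 * pow(h_r, -1, P) % P
```

Two payments made with the same $y$ but different randomness produce different ciphertexts, yet both decrypt to the same $z = g_n^{y}$ that was registered at withdrawal.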

Coin tracing:

1. The bank sends the trustee the transcript of the targeted withdrawal
$(\tilde{y}, z, C_1, C_2, V_1, V_2, V_3)$.

2. The trustee verifies that the transcript is correctly formed. If it is correctly
formed, the trustee sends the bank $\hat{h} = C_1 / C_2^{1/\rho}$ and
$SPK\{\alpha : C_1 = \hat{h} C_2^{\alpha} \wedge h = y_R^{\alpha}\}(0)$. This SPK proves that $(C_1, C_2)$ is decrypted
into $\hat{h}$, which should equal $\tilde{h}^{y}$.

3. For the sent $\hat{h}$, the bank (and shops) check the following for the F value $F_{j_1 \cdots j_u}$
sent during payment of the node $n_{j_1 \cdots j_u}$ ($1 \le u \le \ell$):

Case of $u = 1$: They check $F_{j_1} = \hat{h}$.

Case of $u = 2$: They check $F_{j_1 j_2} = h_{(2, j_2)}^{\hat{h}}$.

Cases of $3 \le u \le \ell$: They compute $F_2 = h_{(2, j_2)}^{F_1}, \ldots, F_{u-1} = h_{(u-1, j_{u-1})}^{F_{u-2}}$,
where $F_1 = \hat{h}$, to check $F_{j_1 \cdots j_u} = h_{(u, j_u)}^{F_{u-1}}$.

If any check is successful, the transcript is derived from the targeted
withdrawal.



5 Discussion 

The e-cash system proposed in this paper, like the original e-coupon system
and group signature scheme, is based on the infeasibility of computing and comparing
discrete logarithms, the security of the ElGamal encryption [12] and the blind
RSA signature [13], and the infeasibility of computing $(x, v)$ satisfying
$f_1 x^{e_1} + f_2 \equiv v^{e_2} \pmod{n}$.

Below, it is argued that the proposed system satisfies the requirements in
Section 2.

Unforgeability: From the infeasibility of computing $(x, v)$ satisfying
$f_1 x^{e_1} + f_2 \equiv v^{e_2} \pmod{n}$, it is infeasible to forge a coin. From the soundness of the
SPK's, it is infeasible to compute the transcript of the payment without a
coin.

No over-spending: The SPK's during payment assure that the F value of the
paid node is correct. Assume that a customer over-spends a coin. If the
customer spends the same node twice or more, over-spending is detected
in Step 2 (b) of the deposit protocol owing to the sameness of the F value. If
the customer spends nodes $n_{j_1 \cdots j_u}$ and $n_{j_1 \cdots j_{u'}}$ ($1 \le u < u' \le \ell$) with
the parent-child relationship, the corresponding $F_{j_1 \cdots j_u}$ and $F_{j_1 \cdots j_{u'}}$ have
relations as $F_{j_1 \cdots j_{u+1}} = h_{(u+1, j_{u+1})}^{F_{j_1 \cdots j_u}}, \ldots, F_{j_1 \cdots j_{u'}} = h_{(u', j_{u'})}^{F_{j_1 \cdots j_{u'-1}}}$ for the F values of
the intermediate nodes, $F_{j_1 \cdots j_{u+1}}, \ldots, F_{j_1 \cdots j_{u'-1}}$. Thus, in Step 2 (a) or (c) of
the deposit protocol, over-spending is detected. Since the SPK's also assure
that $(C_1, C_2)$ is the ElGamal encryption of $z$, the bank cooperating with the
trustee can identify the over-spender in the owner tracing protocol.

No swindling: The blind signature prevents anyone except a customer who
withdrew a coin $(x, v)$ from obtaining the coin. Because of the secrecy of the
SPK and the infeasibility of computing the discrete logarithm, no other party
can obtain $(x, v)$ from a transcript of a payment. Thus, no other party can
spend the coin of a valid customer. The deposit information is a transcript of
a payment. Since the transcript is unforgeable and no other party can spend
the coin of a valid customer, the deposit information cannot be forged.
Anonymity: In the confirmation of the anonymity and unlinkability, the pay- 
ments of the case of 3 < u < £ are only discussed, since the other cases can 
be discussed similarly. To identify the payer, it is required to decide whether 
y which is used to compute the withdrawal (y, x, Ci, C2, Ui, U2> Gs) and y 
which is used to compute the payment (jij2 • • ' ju, Fji---j^,9n,gp2J ■ ■ ■ 

Fi,F2, ...,Fu-i, Ci,C2, Vi,V2,V3, Vi, ^^ 5 . 1 , &re the same. In 

both transactions, since (C'i,C2) and (Ci,C2) are the FlGamal encryptions 
and 1 / is a blinded message on the blind RSA signature, they reveal no in- 
formation about y. Furthermore, Fi, U2, ^3, l^i, ^2, F4, Fsy, . . ., Fs.m-i 
are SPK’s, and thus they also reveal no information. Therefore, the possibly 
available values are 2; together with its public base gn in the withdrawal, 
and the revealed F value Fj^...j^ and the committed F values Fi, . . . , Fu-i 
together with their public bases h, h(2,o), ■ ■ ■ , h(e,o), ^(2,1), ■ ■ ■ , ^(£,1) and the 
random bases gn,gp2, • ■ • . 3 p„_i in the payment. When the revealed F value 
is used, the above decision is performed by deciding whether log^^ x and 
log^(log;j^^ ^^^ (• • • (log,j^^ ^ Fji---j^) ■ ■ ■)) are the same. However, the latter 
decision is infeasible owing to the following proof:
Assume on the contrary that a probabilistic polynomial time algorithm M
decides whether log^^ x and log;^(log;,j^ (• • • (log^^^ Fji -jJ ' ' •)) are the 
same with a non-negligible probability. Then, the following probabilistic po- 
lynomial time algorithm M with the inputs hi , h'l ,zi,z[ G G„ can be con- 
structed: 

First, M chooses /12 &r Gp^,. ■ ■ , hu &r Gp^. Next, from them and the input 
z'l, M computes Zi = Z3 = /ig^, . . . , . Finally, M runs M with 

the inputs g„ = hi, z = zi, h = h'l, /i(2j2) = ^2,- • ■ ) and 

Then, since log^^ 2; = log;,^ and log^(log;,j^ (• • • (log/i^^ •••)) = 

log^/^ z'l, M can decide whether log/j^ and log^/ z\ are the same with the 
non-negligible probability. This contradicts the infeasibility to decide the sa- 
meness of discrete logarithms. Thus, the decision of logg„ 2: = log^^ (log/j^^ ( 
• • • (log/i(^ . ) Fj^^..j^) ■ ■ •)) is also infeasible. This proof also holds on the ca- 
ses of the committed F values Fi,...,F„_i. Therefore, the anonymity is 
assured. 



Unlinkability: To link two payments (jij2 • • • Jm, , gn,Sp2 j ■ ■ ■ j . 9 p„_ 

F2, . . . , Fu-i, Ci,C2, fei, V2, Vs, n, fes.i, . . fe 5 .«-i) andO'lj' • • -f^,, F'., 
g'n, .9fe , • • • , , F[, PI 2 , ..., C'i,C^, Vi, Vi, Vi, Vi, V^i, . . ., % 



■ ■ 1 .9p„-i j Fi, 

T, F', , 



for $u \le u'$, it is required to decide whether the values $y$ used to compute them are
the same. Since the use of the same $y$ implies that F values of the same nodes
are the same, the link is also performed by deciding whether F values of the
same nodes are the same. The possibly available values are the revealed F
values and the committed F values together with the bases used to compute
them, as mentioned in the anonymity.

If the paid nodes $n_{j_1 \cdots j_u}$ and $n_{j'_1 \cdots j'_{u'}}$ have the parent-child relationship, which
means over-spending, the payments are linkable as shown in the confirmation
of no over-spending. Otherwise, $j_1 = j'_1, \ldots, j_v = j'_v$ and $j_{v+1} \ne j'_{v+1}$
for some $v < u$. Then, the common youngest ancestor node of the paid nodes
is $n_{j_1 \cdots j_v}$. When the revealed F values $F_{j_1 \cdots j_u}$ and $F'_{j'_1 \cdots j'_{u'}}$ in the payments
are used to link them, the link is reduced to decide whether

$$\log_{h_{(v+1, j_{v+1})}}(\cdots(\log_{h_{(u, j_u)}} F_{j_1 \cdots j_u})\cdots) = \log_{h_{(v+1, j'_{v+1})}}(\cdots(\log_{h_{(u', j'_{u'})}} F'_{j'_1 \cdots j'_{u'}})\cdots)$$

holds, which means to decide whether $F_{j_1 \cdots j_u}$ and $F'_{j'_1 \cdots j'_{u'}}$ are derived from the
same F value of the node $n_{j_1 \cdots j_v}$. This
decision is infeasible by the similar proof shown in the anonymity. This proof
also holds on the cases of the decision between the committed F values, and
the decision between the F value and the committed F value. Therefore, the
unlinkability is assured.

Anonymity revocation: 

Owner tracing: Since the SPK's assure that $(C_1, C_2)$ is the ElGamal encryption
of $z$, the bank cooperating with the trustee can identify the
payer from the targeted payment in the owner tracing protocol, where
$(C_1, C_2)$ in the other payments are not decrypted and thus the payments
remain anonymous.

Coin tracing: The SPK's during the withdrawal assure that $(C_1, C_2)$ is
the ElGamal encryption of $\tilde{h}^{y}$, and the SPK's during the payment assure
$F_{j_1} = \tilde{h}^{y}, \ldots, F_{j_1 \cdots j_u} = h_{(u, j_u)}^{F_{j_1 \cdots j_{u-1}}}$. Thus, the bank and shops cooperating
with the trustee can trace the transcripts of the payments with
$F_{j_1} = \tilde{h}^{y}$ as shown in the coin tracing protocol. Since $(C_1, C_2)$ in the
other withdrawals are not decrypted, the other payments remain
anonymous.

From the description of the protocols, it is shown straightforwardly that the
divisibility and off-line-ness hold.

Next, the efficiency of the proposed system for $N$, which is the divisibility
precision, is discussed. The setup and withdrawal protocols are conducted in
$O(\log N)$ and $O(1)$, respectively. To pay any amount of a coin, $O(\log N)$ nodes
can be used. The protocols after the payment are conducted in $O(\log N)$ per
node. Thus these protocols are conducted in $O((\log N)^2)$.

6 Conclusion 

In this paper, a divisible e-cash system has been proposed, where (1) the unlinkability
among all payments holds, and (2) the computational complexity of all
protocols is $O((\log N)^2)$. Since a type of SPK (i.e., the proof that a discrete
logarithm and a double discrete logarithm are the same) utilizes a cut-and-choose
method, the proposed system is less efficient than the systems in [2,3]. Therefore,
an open problem is to propose an efficient unlinkable divisible e-cash system
in which no cut-and-choose method is used. In addition, the security of our
system is based on a heuristic assumption, namely the infeasibility of computing $(x, v)$
satisfying $f_1 x^{e_1} + f_2 \equiv v^{e_2} \pmod{n}$. Thus, another open problem is to propose
a system whose security is proved under theoretically clarified cryptographic
assumptions.

Abstract. A one-way hash chain generated by the iterative use of a
one-way hash function on a secret value has recently been widely employed
to develop many practical cryptographic solutions, especially
electronic micropayment schemes. In this paper, we propose a new concept
called a weighted one-way hash chain. We then proceed to use the new
concept to improve in a significant way the performance of micropayment
schemes. We also show that the proposed technique is especially useful in
implementing micropayment on a resource-constrained computing device
such as a hand-held computer.

Keywords: Cryptography, Electronic commerce, Micropayment, One-way
hash chain, Portable computing device.



1 Introduction 

Internet micropayment schemes have received growing attention recently, largely
because these schemes have the potential to be embedded in
numerous Internet-based applications. As a special type of electronic payment
[1], micropayment schemes allow a customer to transfer to a merchant a sequence
of small payments over the computer network in exchange for services
or electronic products from the merchant. With these services or products, it is
often not appropriate to pay the total amount of money either in advance or
afterwards. This is particularly true in certain cases where real time bargaining
results in the requirement of a small payment being received and verified by the
merchant. Possible practical applications of the above micropayment model include
digital newspapers [2], on-line journal subscription, on-line database query,
multimedia entertainment over the Internet, and Internet advertisement (say via
lottery tickets [3]). More examples can be found in [3,4]. In addition, accounting
and pricing for Internet services and mobile telecommunication may represent
yet another set of promising applications of micropayments [5, 6, 7, 8, 9].

The most notable representatives of micropayment schemes include those
proposed in [3,4,10,11,12,13,14]. The fundamental cryptographic tool for most
of these payment systems is a one-way hash chain, which has been widely known
to researchers ever since Lamport first proposed its use in one-time passwords
[15,16]. One-way hash chains have also been extensively employed in the development
of a special class of high-speed signature schemes called one-time
signature schemes [17,18,19,20,21]. As this class of signature schemes uses only
one-way functions, they can be made very fast by using efficient cryptographic
hash functions rather than less efficient trap-door one-way functions [22].

Some notations and symbols about a one-way hash chain are reviewed in the
following.

Notation 1 When a function $h$ is iteratively applied $r$ times to an argument
$x_n$, the result will be denoted as $h^r(x_n)$, that is

$$h^r(x_n) = \underbrace{h(h(\cdots(h}_{r \text{ times}}(x_n))\cdots)).$$

When the function $h()$ in the iteration is instantiated with a one-way hash
function, such as MD5 [23], SHA [24], and HAVAL [25], the result is a one-way
hash chain as shown in Fig. 1. Note that within the chain, each element $x_i$ is
computed as $h^{n-i}(x_n)$.

$$x_0 = h^{n}(x_n) \leftarrow x_1 = h^{n-1}(x_n) \leftarrow \cdots \leftarrow x_{n-1} = h^{1}(x_n) \leftarrow x_n$$

Fig. 1. One-way hash chain.
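
The chain of Fig. 1 can be generated as in the following minimal sketch. It assumes SHA-256 from Python's standard library as the one-way hash function, a choice made here only for illustration (the text names MD5, SHA and HAVAL); the function names are hypothetical.

```python
import hashlib


def h(value: bytes) -> bytes:
    """One application of the hash function (SHA-256 is an assumption)."""
    return hashlib.sha256(value).digest()


def build_chain(secret: bytes, n: int) -> list:
    """Return [x_0, x_1, ..., x_n] with x_n = secret and x_i = h(x_{i+1})."""
    chain = [secret]                  # starts as [x_n]
    for _ in range(n):
        chain.append(h(chain[-1]))    # append x_{n-1}, x_{n-2}, ..., x_0
    chain.reverse()                   # now chain[i] is x_i
    return chain
```

Given `chain = build_chain(secret, n)`, the element `chain[0]` is the root $x_0$ and `chain[n]` is the secret $x_n$.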



2 Review of the PayWord Micropayment Scheme 

The PayWord micropayment scheme [4], which is mainly based on the idea of
using a one-way hash chain, will be particularly illustrative in explaining our
ideas on weighted one-way hash chains. In this section we briefly review the
scheme.

Prior to the first transaction taking place between a customer and a merchant,
the following preparatory steps need to be carried out.

(1) The customer generates a payment chain as follows:

$$x_0 \leftarrow x_1 \leftarrow x_2 \leftarrow \cdots \leftarrow x_n$$

where $x_i = h(x_{i+1})$ for $i = n-1, n-2, \ldots, 1, 0$, and $h()$ is a cryptographic
one-way hash function. The value $x_n$ is a secret value selected at random by
the customer.

(2) The customer signs, e.g., using RSA [26], the root $x_0$, together with the
merchant's identity and other pieces of information (if required):

$Sign_C$(Merchant-ID || $x_0$ || Cert)

where “Cert”, which is used as a proof of credentials, is a digital certificate
issued to the customer by a bank. Note that the signature on $x_0$ acts as a
commitment.

(3) The customer then sends

$Sign_C$(Merchant-ID || $x_0$ || Cert), Merchant-ID, Cert, $x_0$

to the merchant.

After successfully completing the above steps between the customer and the
particular merchant, the number $x_i$ ($i = 1, 2, \ldots, n$) can now be used as the $i$th
coin to be paid. When receiving a new coin $x_i$ from the customer, the merchant
verifies whether $x_{i-1} = h(x_i)$. The merchant accepts $x_i$ as a valid payment only
if the verification is successful. Note that the merchant can store a valid $x_i$ in
place of $x_{i-1}$.
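
The merchant's side of this review can be sketched as follows. This is a simplified illustration under stated assumptions: SHA-256 stands in for $h$, the signature and certificate checks of steps (2) and (3) are omitted, and the class name `Merchant` is hypothetical.

```python
import hashlib


def h(value: bytes) -> bytes:
    """One application of the hash function (SHA-256 is an assumption)."""
    return hashlib.sha256(value).digest()


def build_chain(secret: bytes, n: int) -> list:
    """Return [x_0, ..., x_n] with x_n = secret and x_i = h(x_{i+1})."""
    chain = [secret]
    for _ in range(n):
        chain.append(h(chain[-1]))
    chain.reverse()
    return chain


class Merchant:
    def __init__(self, root: bytes):
        self.last = root       # starts as the committed root x_0
        self.count = 0         # number of coins accepted so far

    def accept(self, coin: bytes) -> bool:
        """Accept x_i only if h(x_i) equals the stored x_{i-1}."""
        if h(coin) != self.last:
            return False
        self.last = coin       # store x_i in place of x_{i-1}
        self.count += 1
        return True
```

Replaying an already accepted coin fails, because the stored value has moved one step along the chain.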

3 Related Work on Improving Micropayment Schemes 

In our view, a good micropayment scheme should be designed in such a way 
to meet the requirements of minimizing the computational, storage, and ad- 
ministrative costs for interactions with the bank. Indeed, many micropayment 
schemes that have been developed by various researchers share a simple structure 
and acceptable performance with the one-way hash chain described above. 

It should also be pointed out that ease of implementation is another fundamental
requirement of a payment system, especially in such applications as
portable micropayment systems and mobile telecommunication charging schemes
as mentioned earlier. These environments generally involve the use of portable
computing devices which often have limited computing resources, e.g., a small
amount of memory space, a relatively slow CPU, and a short battery life.

In [7], an experimental portable micropayment system based on PayWord [4]
has been reported. From the discussions given in the paper, it becomes clear that
for a general purpose portable device, while a small or moderately large value of
$n$ (called the length of the payment chain) would be acceptable, a larger $n$ can cause
unacceptably lengthy delay in computation. On the other hand, a larger value
of $n$ reduces the required amount of computation for public key based signatures,
which is actually the essence of developing PayWord-like micropayment schemes.

To solve the above problem with contradicting requirements, a straightforward
alternative to using a larger value of $n$ was suggested in [4,7]. The alternative
method allows one to construct different payment chains for different
denominations. This method, however, introduces a new problem in that it
complicates the task of implementation and operations (also stated in [7]). The
alternative method also requires much more memory space to store the $x_n$'s (the
random secrets) for all the payment chains, and to remember the index of the last
spent coin of each payment chain. Thus, the alternative method does not really
provide satisfactory solutions from the viewpoint of portable devices. Furthermore,
we also note that the method complicates the operation of the merchant
and requires the merchant to store the last received coin of each chain from the
customer. Because of all these practical difficulties, proposals of this type will not be
considered in the following discussions.

We argue that the development of efficient structures different from a simple 
one-way hash chain is necessary for good micropayment schemes, especially for 
those to be implemented on a portable computing device. Currently, only very 
limited research results on this topic can be found in the literature. 

One such attempt is a structure called PayTree proposed in [27]. While a
PayTree does offer a solution for a multi-merchant environment where a PayTree
can be spent among many different merchants, it has two drawbacks that
discourage the use of PayTree in practice. The first drawback is that the customer
needs to store all the leaf nodes (as independent random numbers) of a
PayTree. These leaf nodes represent electronic coins bought by the customer from
a bank. In a practical application, as the number of leaf nodes could be large,
the customer may need to prepare a large amount of memory space to store all
the node values. The second drawback is that double spending of a coin in the
PayTree scheme cannot be avoided, although it can be detected afterwards. The
reason is simple: the same coin can be paid to many different merchants by the
customer. This drawback greatly complicates the system's operation.

In a more recent development, the authors of [28] proposed a new tree-based
structure called an unbalanced one-way binary tree (UOBT). A major difference
between UOBT and PayTree is that UOBT facilitates merchant-specific
micropayments in a simple way similar to the conventional one-way hash chain
structure. In a scheme based on UOBT, a secret random value is chosen as the root
(very much like the $x_n$ in the one-way hash chain structure). This secret value is
used to construct a tree from the root towards the lower levels in an unbalanced
binary tree, such that given a child node, no parent node can be derived from the
child node. In [28], it was demonstrated that the UOBT approach can improve
the performance of micropayment schemes significantly. An independent analysis
recently carried out by Peirce [29] further confirms that the UOBT approach
is superior to the conventional one-way chain not only from a theoretical point
of view, but also from an implementor's point of view.

The main contribution of this paper is to propose a new technique for im- 
proving the performance of micropayments based on one-way hash chains. The 
core of the new technique is to assign, in a randomized manner, a different de- 
nomination to each coin. An interesting feature of the new technique is that it 
complements UOBT [28], and can be combined with the UOBT approach using 
two different methods. The first method is straightforward, and it assists the further
enhancement of the performance of UOBT. This is based on the observation
that although the UOBT constructs a tree instead of a chain, a value in the tree
still stands for a coin in the same way as a value in an ordinary one-way hash
chain. The second method is of special interest in that it enables the spending
of a single UOBT structure among several different merchants, thereby further
enhancing the overall performance of a micropayment system.



4 The Proposed Solution 

In a real world application involving a customer and a merchant, it would be
quite possible that the customer is asked to pay several coins at a time in order to
obtain services from the merchant. Think of the case of browsing an interesting
web site. Instead of paying fees page by page, it would make more sense for the
customer to pay the overall costs prior to or after viewing all the relevant pages
for a particular topic.

Let us consider the case where the number of coins to be paid, say $c$, is
assumed to be a random integer selected from $[1, t]$. Examining the micropayment
schemes (and their implementation) published previously in the literature,
one can see that all these proposals have implicitly assumed that $c$ is a random
integer. Realizing this, some researchers suggest using different chains for different
denominations [4,7]. While this would certainly improve the performance,
the major aim of this paper is to examine how to further improve, in
a significant way, the performance of a PayWord-like micropayment scheme in
which the customer is asked to pay more than one coin at a time to the merchant.
The starting point of our solutions is the use of so-called weighted one-way hash
chains. We show how these weighted chains can be used to provide a far better
alternative to improve the overall performance and implementation of a
micropayment system.



4.1 The Weighted One-Way Hash Chain 

With the conventional one-way hash chain approach, typically each coin in a
payment chain is assumed to be worth $d$ cents. When $x_j$ is the last spent coin
and $c$ more coins need to be paid to the merchant for a new payment transaction,
the customer computes $x_{j+c} = h^{n-j-c}(x_n)$ and passes it over to the merchant
as a new coin. The last spent coin is then updated to $x_{j+c}$, and $j$ to $j + c$. Clearly,
the value of each payment will be a multiple of $d$ cents. The actual value
of $d$ can be determined at the outset between the customer and the merchant.
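
The conventional payment step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SHA-256 stands in for the one-way function $h$, the seed value is made up, and the helper names (`h_iter`, `coin`, `pay`, `verify`) are hypothetical and do not come from the paper or from PayWord.

```python
import hashlib

def h(x: bytes) -> bytes:
    """One-way hash function; SHA-256 is an assumption of this sketch."""
    return hashlib.sha256(x).digest()

def h_iter(x: bytes, k: int) -> bytes:
    """Apply h to x exactly k times, i.e. compute h^k(x)."""
    for _ in range(k):
        x = h(x)
    return x

def coin(x_n: bytes, n: int, i: int) -> bytes:
    """Return x_i = h^(n-i)(x_n), recomputed from the secret seed x_n."""
    return h_iter(x_n, n - i)

def pay(x_n: bytes, n: int, j: int, c: int) -> tuple[bytes, int]:
    """Spend c more coins after x_j; return the new coin x_(j+c) and the new index."""
    if j + c > n:
        raise ValueError("not enough coins left in the chain")
    return coin(x_n, n, j + c), j + c

def verify(last: bytes, new: bytes, c: int) -> bool:
    """Merchant check: hashing the new coin c times must give the last coin."""
    return h_iter(new, c) == last

n = 100
x_n = b"hypothetical secret seed"
x_0 = coin(x_n, n, 0)        # root of the chain, the value the customer signs
x_3, j = pay(x_n, n, 0, 3)   # first payment: 3 coins, i.e. 3*d cents
x_8, j = pay(x_n, n, j, 5)   # second payment: 5 more coins
```

Note that each payment recomputes the coin from the seed, which costs $n - j - c$ hash evaluations; this is the cost that the analysis in Section 5 counts.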

In order to improve the performance of micropayment systems based on one- 
way hash chains, e.g., the PayWord scheme [4], the following new concept is 
introduced and its applications will be examined. 

Definition 1 A weighted one-way hash chain $\{x_0, x_1, x_2, \ldots, x_n\}$, where
$x_i = h^{n-i}(x_n)$, consists of a generic one-way hash chain together with a specific
weighting value $w_i$ assigned to each $x_i$.

A possible weighting assignment mechanism is the self-encoding method, i.e.,
to let $w_i = f(x_i)$ where the function $f$ can be any well defined mapping from a
value $x_i$ to a weight $w_i$. For a concrete example, when $w_i$ needs to be a random
variable over $[1, t]$, the function can be defined as

$w_i = (x_i \bmod t) + 1. \qquad (1)$

When $x_i \gg t$, the function can be modified to

$w_i = ((x_i \bmod 2^b) \bmod t) + 1 \qquad (2)$

where $b \ge \lceil \log_2 t \rceil$. In Eq. (2), $w_i$ can be computed with little effort by a binary
arithmetic processor, especially when $x_i \gg t$ and $t$ is not a power of two. Note
that $x_i$ generated by a typical cryptographic one-way hash function will have a
value close to 2 raised to the output length of the hash function, so $x_i \gg t$ holds
in practice. However, if $t$ is not a power of two, $w_i$ obtained from Eq. (1)
will be distributed more uniformly over $[1, t]$. As a real example, for $t = 6$, setting
$b = 5$ or $b = 6$ in Eq. (2) will result in an acceptable $w_i$ that is more or less
uniformly distributed over $[1,6]$.
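
To make the remark about $t = 6$ concrete, the following Python sketch evaluates Eq. (1) and Eq. (2) and counts how often each weight occurs when the $b$ low-order bits of $x_i$ take every possible value. The function names are hypothetical, and treating the low bits of a hash value as uniform is an assumption of the sketch.

```python
from collections import Counter

def weight_eq1(x: int, t: int) -> int:
    """Eq. (1): w = (x mod t) + 1."""
    return (x % t) + 1

def weight_eq2(x: int, t: int, b: int) -> int:
    """Eq. (2): w = ((x mod 2^b) mod t) + 1; only the b low bits of x are used."""
    return ((x & ((1 << b) - 1)) % t) + 1

def eq2_distribution(t: int, b: int) -> dict[int, int]:
    """Count each weight as the b low bits range over all 2^b values."""
    counts = Counter(weight_eq2(r, t, b) for r in range(1 << b))
    return dict(sorted(counts.items()))

print(eq2_distribution(6, 5))  # {1: 6, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}
print(eq2_distribution(6, 6))  # {1: 11, 2: 11, 3: 11, 4: 11, 5: 10, 6: 10}
```

With $b = 5$ the weights 1 and 2 occur with probability 6/32 and the others with 5/32; with $b = 6$ the split is 11/64 versus 10/64, so a larger $b$ brings the distribution closer to uniform.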

4.2 Micropayment with Varying Denomination 

Based on a weighted one-way hash chain as discussed above, the original PayWord
micropayment scheme can be modified as follows. The payment chain
generation process is still the same but the signature part needs some minor
modifications. More specifically, we define

$\mathrm{Sign}_C(\text{Merchant-ID} \,\|\, x_0 \,\|\, \text{Other-Info} \,\|\, \text{Cert})$

where ‘Other-Info’ describes the parameters $t$ and $b$ and some other necessary
information, e.g., $d$ if it is different for each chain. During the payment process,
the denomination of each coin is determined by its weighting value described in
the previous sub-section.

Suppose that in the $i$th payment transaction, the customer is required to pay
the merchant $c_i \cdot d$ cents (where $c_i$ is an integer in $[1, t]$), i.e., paying $c_i$ coins
in the one-way hash chain. Fig. 2 describes the payment procedure performed
by the customer when the weighted one-way hash chain is employed. For each
new payment which costs $c_i \cdot d$ cents, the customer computes a pair
$\{x_j, e\}$. If $e = 1$, $x_j$ is the coin value exactly next to the one received by the
merchant in the previous transaction. If $e > 1$, more than one coin will be spent
and $x_j$ is the dominating one, i.e., all other coins are derived from $x_j$.

An additional variable $B$ (it stands for balance) can be defined to further
facilitate the use of a weighted payment chain. The variable $B$ is initially
set to zero and will be modified during a payment transaction. The operations
between Step (2.7) and Step (2.17) are optional. Note that if both $c_i$ and $w_j$
are random integers over $[1,t]$, then the variable $B$ will statistically remain in
a limited range $[-\delta, +\delta]$ for most of the time. The parameter $\delta$ depends mainly
on the value of $t$. From this point of view, operations between Step (2.7) and
Step (2.17) are not really required. However, the customer may intentionally
break the rule by spending $c_i$ larger than the computed $w_j$ for every payment.
This will make $|B|$ much larger than the expected parameter $\delta$ (with $B$ negative)
after the whole payment chain is spent, resulting in a situation that is not fair to
the merchant. To avoid this possible cheating by the customer, operations between
Step (2.7) and Step (2.17) compute one or more additional coins in order to keep
$B$ within the specified range. In Step (2.7), $\delta = 2t$ is selected, primarily as an example.



Initially: B ← 0; j ← 0

For each new payment worth c_i · d cents:

2.1   if j < n then
2.2     j ← j + 1
2.3     x_j ← h^(n−j)(x_n)
2.4     w_j ← ((x_j mod 2^b) mod t) + 1
2.5     B ← B + (w_j − c_i)
2.6     e ← 1
2.7     while B < −2t do
2.8       if j < n then
2.9         j ← j + 1
2.10        x_j ← h^(n−j)(x_n)
2.11        w_j ← ((x_j mod 2^b) mod t) + 1
2.12        B ← B + w_j
2.13        e ← e + 1
2.14      else
2.15        (stop because there are not enough coins)
2.16      endif
2.17    endwhile
2.18  else
2.19    (stop because there are not enough coins)
2.20  endif
2.21  send x_j and e to the merchant

Fig. 2. Customer’s payment procedure based on a weighted one-way hash chain.



Fig. 3 describes the corresponding payment verification process performed
by the merchant. The variable $x_\ell$ is initially set to $x_0$. It is used to store the
last received coin that has passed the verification process.
For each newly received payment $\{x_i, e\}$, the merchant checks whether or not
$x_\ell = h^e(x_i)$. The merchant also adjusts the variable $B$ according to one or more
computed weighting values $w$ and checks whether or not $B \ge -2t$ (let $\delta$ be $2t$).
If both verifications succeed, the merchant stores $x_i$ into $x_\ell$ as the
last received coin. After the successful completion of the present transaction,
say the $p$th transaction, the merchant should have collected $\left(\sum_{i=1}^{p} c_i\right) \cdot d$ cents
from the customer. As in the payment procedure by the customer, the checking
of whether $B_{temp} \ge -2t$ in Step (3.8) is optional.

Initially: B ← 0; x_ℓ ← x_0

For each received {x_i, e} worth c_i · d cents:

3.1   Y ← x_i; B_temp ← B
3.2   for j from 1 to e
3.3     w ← ((Y mod 2^b) mod t) + 1
3.4     B_temp ← B_temp + w
3.5     Y ← h(Y)
3.6   endfor
3.7   B_temp ← B_temp − c_i
3.8   if Y = x_ℓ and B_temp ≥ −2t then
3.9     (accept the payment)
3.10    x_ℓ ← x_i
3.11    B ← B_temp
3.12  else
3.13    (reject the payment)
3.14  endif

Fig. 3. Merchant’s payment verification based on weighted one-way hash chain.
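
The two procedures in Fig. 2 and Fig. 3 can be exercised together with a short Python sketch. It is not the authors' implementation: SHA-256 as $h$, the values $t = 6$ and $b = 5$, the seed, and the class names are all assumptions chosen for illustration. The customer here stores the whole chain instead of recomputing $h^{n-j}(x_n)$ each time, which trades memory for time.

```python
import hashlib

T, B_BITS = 6, 5  # parameters t and b; illustrative values only

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()  # SHA-256 is an assumption

def weight(x: bytes) -> int:
    """Eq. (2) applied to the hash value read as a big-endian integer."""
    return ((int.from_bytes(x, "big") % (1 << B_BITS)) % T) + 1

class Customer:
    def __init__(self, seed: bytes, n: int):
        # Precompute the chain so that chain[i] == x_i = h^(n-i)(x_n).
        chain = [seed]
        for _ in range(n):
            chain.append(h(chain[-1]))
        self.chain = chain[::-1]
        self.n, self.j, self.balance = n, 0, 0

    def pay(self, c: int):
        """Fig. 2: return (x_j, e) for a payment worth c*d cents."""
        j, bal = self.j, self.balance
        if j >= self.n:
            raise RuntimeError("not enough coins")
        j += 1
        bal += weight(self.chain[j]) - c
        e = 1
        while bal < -2 * T:                 # optional steps 2.7 to 2.17
            if j >= self.n:
                raise RuntimeError("not enough coins")
            j += 1
            bal += weight(self.chain[j])
            e += 1
        self.j, self.balance = j, bal       # commit only if the payment succeeds
        return self.chain[j], e

class Merchant:
    def __init__(self, x0: bytes):
        self.last, self.balance = x0, 0

    def verify(self, x: bytes, e: int, c: int) -> bool:
        """Fig. 3: accept iff h^e(x) is the last coin and the balance is in range."""
        y, bal = x, self.balance
        for _ in range(e):
            bal += weight(y)
            y = h(y)
        bal -= c
        if y == self.last and bal >= -2 * T:
            self.last, self.balance = x, bal
            return True
        return False

customer = Customer(b"hypothetical secret seed", n=200)
merchant = Merchant(customer.chain[0])
for c in [3, 6, 1, 6, 6, 6, 6, 2]:
    x, e = customer.pay(c)
    assert merchant.verify(x, e, c)
```

Because both sides add the same weights and subtract the same $c_i$, their balances stay equal after every accepted payment, and replaying an already accepted coin fails the chain check.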

4.3 Some Useful Special Weighting Assignment Algorithms 

For some practical micropayment applications, the value of $c_i$ may not be uniformly
distributed over the range of $[1, t]$, e.g., small values or large values of $c_i$
will occur more frequently. In the following, the weighting value $w_i$ (over $[1, t]$)
of each coin $x_i$ is computed by using simple algorithms given in Fig. 4, where
$(x_{i,t-2}, x_{i,t-3}, \ldots, x_{i,1}, x_{i,0})_2$ are the $t - 1$ least significant bits in the binary
representation of $x_i$. For the case of small value bound payments in which small
values of $c_i$ will occur more frequently, the special weighting assignment mechanism
in Fig. 4(a) can be employed. As an example, in Fig. 4(a), if $t = 5$ then the
probabilities for $w_i$ to be 1, 2, 3, 4, and 5 are $1/2$, $1/4$, $1/8$, $1/16$, and $1/16$, respectively.

On the other hand, for the case of large value bound payments, the weighting
assignment mechanism in Fig. 4(b) can be useful. Consider the case of $t = 5$.
Then the probabilities for $w_i$ to be 1, 2, 3, 4, and 5 are $1/16$, $1/16$, $1/8$, $1/4$, and
$1/2$, respectively.

One more question to be raised is whether or not a special weighting assignment
mechanism is required for medium value bound payments (or, more
precisely, payments that are approximately normally distributed with mean at
$\lceil t/2 \rceil$). Interestingly, the assignment mechanisms provided in Eq. (1) or Eq. (2)
more or less meet the above requirement, since the weighting $w_i$ computed via
these two approaches will be a random integer, almost uniformly selected from
the range of $[1, t]$. Therefore, statistically the variable $B$ will still remain in a
limited range $[-\delta', +\delta']$ during the payment process. However, in this case, $\delta'$
may be larger than the value $\delta$ obtained when either Eq. (1) or Eq. (2) is employed
for uniformly random payments as shown in Fig. 2. As a better alternative, for
near normally distributed value bound payments in which medium values of $c_i$
are expected to occur more
frequently, the special weighting assignment mechanism described in Fig. 4(c)

Input: (x_{i,t−2}, x_{i,t−3}, ..., x_{i,1}, x_{i,0})_2
Output: w_i

4a.1  w_i ← 1; j ← t − 2
4a.2  while x_{i,j} = 1 do
4a.3    w_i ← w_i + 1
4a.4    j ← j − 1
4a.5  endwhile

(a) Algorithm for small value bound payments.

Input: (x_{i,t−2}, x_{i,t−3}, ..., x_{i,1}, x_{i,0})_2
Output: w_i

4b.1  w_i ← t; j ← t − 2
4b.2  while x_{i,j} = 1 do
4b.3    w_i ← w_i − 1
4b.4    j ← j − 1
4b.5  endwhile

(b) Algorithm for large value bound payments.

Input: (x_{i,t−2}, x_{i,t−3}, ..., x_{i,1}, x_{i,0})_2
Output: w_i

4c.1  w_i ← ⌈t/2⌉; j ← t − 2; k ← 1; p ← 0
4c.2  while x_{i,j} = 1 do
4c.3    w_i ← w_i + (−1)^p · k
4c.4    j ← j − 1; k ← k + 1; p ← p + 1
4c.5  endwhile

(c) Algorithm for near normally distributed value bound payments.

Fig. 4. Proposed special weighting assignment algorithms.

can be employed. Consider the case of $t = 5$. Then the probabilities for $w_i$ to be
1, 2, 3, 4, and 5 are $1/16$, $1/8$, $1/2$, $1/4$, and $1/16$, respectively.
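
The three algorithms of Fig. 4 are small enough to enumerate exhaustively. The Python sketch below computes the exact weight distribution for a given $t$, assuming the $t - 1$ input bits are independent and uniform, and assuming each loop simply stops once all $t - 1$ bits have been consumed (the figure leaves this termination case implicit). Function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import ceil

def w_small(bits, t):
    """Fig. 4(a): start at 1 and add 1 for each leading 1-bit."""
    w = 1
    for bit in bits:            # bits are ordered x_{i,t-2} down to x_{i,0}
        if bit != 1:
            break
        w += 1
    return w

def w_large(bits, t):
    """Fig. 4(b): start at t and subtract 1 for each leading 1-bit."""
    w = t
    for bit in bits:
        if bit != 1:
            break
        w -= 1
    return w

def w_medium(bits, t):
    """Fig. 4(c): start at ceil(t/2) and step +1, -2, +3, ... per leading 1-bit."""
    w, k, p = ceil(t / 2), 1, 0
    for bit in bits:
        if bit != 1:
            break
        w += (-1) ** p * k
        k, p = k + 1, p + 1
    return w

def distribution(algorithm, t):
    """Exact weight probabilities when the t-1 input bits are uniform."""
    patterns = list(product((0, 1), repeat=t - 1))
    dist = {w: Fraction(0) for w in range(1, t + 1)}
    for bits in patterns:
        dist[algorithm(bits, t)] += Fraction(1, len(patterns))
    return dist

for name, alg in [("small", w_small), ("large", w_large), ("medium", w_medium)]:
    print(name, [str(distribution(alg, 5)[w]) for w in range(1, 6)])
```

For $t = 5$ the mass halves with every extra leading 1-bit, so algorithm (a) concentrates on $w_i = 1$, algorithm (b) on $w_i = 5$, and algorithm (c) on the middle value $w_i = 3$.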

Of course, the actual distribution of $c_i$ over the range $[1,t]$ or equivalently
the expected value of $c_i$ will be determined by a number of factors, such as the
customer’s payment behavior, prices of products, or a combination of the
two. These, however, are outside the main scope of this paper.

Nevertheless, let us point out that in typical micropayment applications, the
parameter $t$ should not be a very large integer, e.g., $10 < t < 20$ may represent
a reasonable range of values. Therefore, generally speaking, there is perhaps
little need to consider more complex weighting assignment mechanisms than the
simple ones discussed above. In a rare situation where a large payment (i.e.,
$c_i > t$) needs to be executed, the customer can simply send more than one
coin to the merchant, just like in the conventional one-way hash payment chain
approach.

5 Analysis of Performance

In this section, a comparison of the performance of various possible micropayments
based on the weighted and the conventional one-way hash chains will be
carried out. To simplify the analysis, it will be assumed that the value of $c_i$ is
uniformly distributed over the range of $[1,t]$ and $e = 1$ for every transaction.

Lemma 1. With a conventional one-way hash chain (with length $n$), if each
payment transaction is worth $c_i$ coins (i.e., $c_i \cdot d$ cents) where $c_i \in_R [1,t]$, then
the expected computational cost for each $d$ cent transaction is approximately
$n/(t+1)$ evaluations of a one-way hash function.

Proof. The expected total number of hashing operations required to evaluate all
the coins $c_i$ within a payment chain is

$$T = E\left[\sum_{k=1}^{m} \left(n - \sum_{i=1}^{k} c_i\right)\right] \approx \frac{n^2}{t+1}, \qquad c_i \in_R [1,t],$$

where $c_m$ is the last spent coin of the chain. Therefore, the expected computational
cost for every $d$ cent transaction is roughly $T/n = n/(t+1)$ hashing
operations. □



Lemma 2. With a weighted one-way hash chain (with length $n$), if each payment
transaction is worth $c_i \cdot d$ cents, where $c_i \in_R [1, t]$, then the expected computational
cost for each $d$ cent transaction is approximately $n/(t+1)$ evaluations of a
one-way hash function.

Proof. The expected number of all the cents embedded in the chain is

$$D = \sum_{i=1}^{n} (c_i \cdot d) = \frac{(t+1) \cdot d \cdot n}{2},$$

whereas the total number of hashing operations required to evaluate all the
coins $x_i$ ($i = 1, 2, \ldots, n$) is

$$T = \sum_{i=1}^{n} (n-i) = \frac{n \cdot (n-1)}{2}.$$

Therefore, the expected computational cost for each $d$ cent transaction is
$T/(D/d) = (n-1)/(t+1) \approx n/(t+1)$ evaluations of hashing. Note that we have taken into
account the fact that $n \gg t$ in most applications. □

Note that for transactions worth $c_i \cdot d$ cents ($c_i \in_R [1, t]$) in a payment chain
of length $n$, Lemma 1 and Lemma 2 indicate that on average, the conventional
one-way hash chain and the proposed weighted one-way hash chain perform
equally well for paying $d$ cents (i.e., the defined smallest amount of payment for
the specific micropayment scheme).

However, this does not imply that both techniques perform equally efficiently
for all micropayment applications. While a conventional one-way hash chain can
be used to embed $n \cdot d$ cents, a weighted chain with length $n$ can on average
represent $(\frac{t+1}{2} \cdot n) \cdot d$ cents. We note that if the parameter $t$ is equal to one, i.e.,
for an equally weighted chain with all $w_i = 1$, the weighted chain reduces
to a conventional chain.

Recall that with the conventional one-way hash chain approach, after spending
all the $n$ coins, the customer must generate a new payment chain and
create a new public key based signature for the chain. When this happens too
often, the overhead of the micropayment scheme will be increased significantly.
At first glance, the selection of a large value $n$ may allow the customer to
sign chains less frequently. In fact, this does not completely solve the problem,
simply because a large $n$ means that more time is spent
evaluating each single coin. As was mentioned earlier, the first good solution to
the problem is to use an unbalanced one-way binary tree [28]. A major benefit
of such an unbalanced tree is that it requires less frequent generation of public
key based signatures and a smaller amount of computation time for each coin.
In what follows, we analyze in detail how and why the use of weighting over each
coin represents yet another important contribution to solving the problem.

Lemma 3. Suppose that $(\frac{t+1}{2} \cdot n) \cdot d$ cents will be embedded into a single chain and
each payment transaction will be worth $c_i \cdot d$ cents where $c_i \in_R [1,t]$. Then, the
conventional chain approach will take $\frac{t+1}{2}$ times the computation required by
the approach of employing a weighted one-way hash chain.

Proof. In order to embed $(\frac{t+1}{2} \cdot n) \cdot d$ cents into a single conventional chain, the
chain length must be enlarged to $n' = \frac{t+1}{2} \cdot n$. However, from the result
of Lemma 1, this will result in

$$\frac{n'}{t+1} = \frac{t+1}{2} \cdot \frac{n}{t+1} = \frac{n}{2} \qquad (3)$$

expected number of hashing evaluations for each $d$ cent transaction. This represents
$\frac{t+1}{2}$ times the computational cost required by a weighted one-way hash
chain. □



Clearly, by reducing the computational complexity from $n/2$ (see Eq. (3)) down
to $n/(t+1)$ with $t > 1$, the proposed weighted one-way hash chain technique will
significantly improve the performance of micropayment systems, by an order of $O(t)$.
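
A small simulation can be used to sanity-check Lemmas 1 to 3. The Python sketch below counts hash evaluations per $d$ cents for a conventional chain of length $n' = \frac{t+1}{2} \cdot n$ and for a weighted chain of length $n$, under the same assumptions as the analysis ($c_i$ uniform on $[1,t]$, $e = 1$, every coin recomputed from $x_n$). The weights are modelled as uniform random integers rather than derived from a hash, and the values $n = 1000$, $t = 9$ and the seed are arbitrary choices.

```python
import random

def conventional_cost(n_coins: int, t: int, rng: random.Random) -> float:
    """Hash evaluations per d cents when c coins are spent per transaction."""
    j, hashes = 0, 0
    while True:
        c = rng.randint(1, t)
        if j + c > n_coins:
            break
        j += c
        hashes += n_coins - j        # computing x_j = h^(n-j)(x_n) from the seed
    return hashes / j                # j coins were spent, each worth d cents

def weighted_cost(n_coins: int, t: int, rng: random.Random) -> float:
    """Same measure for a weighted chain with e = 1: one coin per transaction."""
    hashes, units = 0, 0
    for j in range(1, n_coins + 1):
        hashes += n_coins - j
        units += rng.randint(1, t)   # weight w_j, modelled as uniform on [1, t]
    return hashes / units

rng = random.Random(2000)            # fixed seed so that runs are repeatable
n, t = 1000, 9
conv = conventional_cost((t + 1) // 2 * n, t, rng)
wt = weighted_cost(n, t, rng)
print(f"conventional: {conv:.1f}, weighted: {wt:.1f}, ratio: {conv / wt:.2f}")
```

With these parameters the lemmas predict roughly $n/2 = 500$ evaluations per $d$ cents for the conventional chain, $n/(t+1) = 100$ for the weighted chain, and a ratio near $(t+1)/2 = 5$; a simulated run should land close to these figures, with some statistical spread.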

We have the following more general result that indicates in greater detail the
performance improvement achievable by the use of a weighted one-way hash
chain.



Lemma 4. Assume that there will be an equal amount of money to be embedded
into two one-way hash chains, one conventional and the other weighted. Further
assume that each payment transaction will be worth $c_i \cdot d$ cents, where $c_i$ is a
random (but not necessarily uniformly distributed) integer selected from $[1,t]$
and with an expected value of $E$. Then, the conventional one-way hash chain
approach will take $E$ times the computation required by the weighted
chain approach.
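
Lemma 4 is stated without proof. One way to see why it is plausible, following the counting used in Lemmas 1 to 3, is sketched below. The symbol $M$ (the total amount in units of $d$ cents) is notation introduced here for illustration, and the sketch assumes $e = 1$, chain lengths much larger than $t$, and that each coin is recomputed from the chain seed.

```latex
\begin{align*}
\text{Conventional chain: } & n' = M, \quad
  m' \approx \frac{M}{E} \text{ transactions}, \quad
  T' \approx m' \cdot \frac{n'}{2} = \frac{M^2}{2E}, \\
\text{Weighted chain: } & n \approx \frac{M}{E}, \quad
  T = \sum_{i=1}^{n} (n - i) \approx \frac{n^2}{2} = \frac{M^2}{2E^2}, \\
\text{Ratio per $d$ cents: } & \frac{T'/M}{T/M}
  = \frac{M/(2E)}{M/(2E^2)} = E .
\end{align*}
```

Setting $E = (t+1)/2$ recovers the factor stated in Lemma 3.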

To summarize the above discussions, by using a weighted one-way hash chain,
the computational cost of the customer can be reduced to about $1/E$ times that
required by a conventional one-way hash chain (i.e., $2/(t+1)$ for uniformly
distributed $c_i$). It is equally important to note
that the same level of improvement is achieved on the merchant’s side.

As discussed earlier, weighted one-way hash chains and unbalanced one-way
binary trees (UOBTs) introduced in [28] represent two complementary methods
for improving the performance of micropayment systems. These two techniques
can be combined to provide an even greater number of efficient solutions. To close
this section, we note that

(1) The UOBT technique improves the customer’s performance in the order of
$O(\sqrt{n})$. However, it does not improve the merchant’s performance and in
fact it requires the merchant to store $O(\sqrt{n})$ more temporary values (for
details see [28]).

(2) The proposed weighted one-way hash chain technique improves both the
customer’s and the merchant’s performance in the order of $O(t)$.

(3) Both the weighted one-way hash chain and the UOBT are suitable for
implementation on portable computing devices.

6 Conclusions and Future Works 

We have proposed the novel concept of a weighted one-way hash chain, and pro- 
ven that the chain is very useful in the design and implementation of electronic 
micropayments, especially for those systems to be used with portable devices. 
An important characteristic of the proposed weighted one-way hash chain is 
that it improves not only the performance of the customer but also that of the 
merchant. 

It is interesting to consider the role of weight assignment on each computed
hash value. If weighting is defined to encode the coin denomination, then a
micropayment scheme with varying denomination is obtained. On the other hand, if
weighting is defined to encode the identity of the merchant, then a micropayment
scheme for multiple merchants can be readily constructed. This is particularly
true when each hash value within the UOBT is mapped to a predefined set of
merchant identities.

More interestingly, if each hash value of the UOBT is given two weighting 
values via two separate weighting assignment functions, one being used to encode 
a merchant’s identity and the other to encode the coin denomination, then a 
micropayment scheme for multiple merchants with varying denominations can 
be obtained. 

Examining other possible weight assignment methods for various applications
other than micropayments is an interesting open research topic.

Abstract. Designing a practical and complete electronic cash scheme
has proved difficult. Designs must seek to optimise often conflicting metrics
such as efficiency, anonymity, and the ability to make exact payments.
Gains in one area often result in a loss in one or more other areas. Sev- 
eral schemes have accepted linkability of some payments as a concession 
to getting the balance right. A point that has not been highlighted is 
the problem of preventing linking between payments made with differ- 
ent linkable coins. This paper reviews several electronic cash schemes 
which have the linkability property and concludes that linking across 
coins is of significant practical concern. Design improvements are sug- 
gested along with observations regarding the user’s active role in pre- 
serving anonymity. 



1 Introduction 

A definitive and enduring statement as to what constitutes electronic cash is 
elusive. Most definitions of electronic cash seek to distill the essential properties 
of traditional physical cash. Definitions typically include properties such as: 

— resistance to forgery; 

— protection of the customers’ privacy or anonymity; 

— ease of transportation and spending. 

Since Chaum introduced blind signatures [8], provision of customer 
anonymity during payment has been a focal point for electronic cash scheme 
design. The ability to prevent or at least contain fraud is also central. The two 
objectives (customer anonymity and scheme security) seem to be opposed. The 
scheme operator (the bank) gains a higher level of assurance that fraud is con- 
tained if all transaction details are available for scrutiny. The customer gains 
maximum privacy if no details of any transactions are revealed. The objective 
has been to strike a balance such that both the scheme operator and customers 
are satisfied. 

One of the properties of physical cash is transferability. Coins and notes are
easily transferred from one entity to another without the need for any other parties
to be involved. With electronic cash, this service is desirable from the user’s
point of view but the scheme operator may find this unacceptable because it
introduces serious challenges to effective scheme security. The bank’s accounting
process is greatly assisted by requiring coins to be returned to the bank rather
than circulating endlessly. Otherwise the introduction of forged coins may go
undetected. Also, law enforcement and taxation agencies may be less than happy
at the prospect of large amounts of electronic cash circulating without any
opportunity for auditing.

In addition to security and anonymity, other forces influence electronic cash 
scheme design. The majority of electronic cash schemes proposed in the last ten 
years have been built on the assumption that off-line payment transactions are 
necessary if the scheme is to achieve low transaction costs and high availabil- 
ity. More recently, this assumption has been challenged in light of advances in 
communication technology (both coverage and cost). 

The off-line paradigm has resulted in an emphasis on smart card based 
schemes because of the portability of such devices and the relative tamper- 
resistance provided. Advances in smart card technology have seen the devel- 
opment of devices with increased storage and greater processing power - all 
at reduced cost. Unfortunately, with respect to tamper-resistance, the race be- 
tween developers and those seeking to expose design flaws [2,1] has resulted in 
decreased confidence that such devices will ever be completely tamper-proof. 

In payment transactions involving traditional physical cash, it may happen 
that the customer does not have coins and notes that total to the exact amount 
of the transaction. Customers have the expectation that the merchant will be 
able to give change in this case. This effectively transfers the problem from the 
customer to the merchant. An electronic cash system is not practical unless a 
user can efficiently make exact payments. 

Previously proposed electronic cash schemes using smart cards have worked 
within the limitations of the smart cards available when designing the scheme. 
Storing, processing and transmitting large numbers of low-valued atomic elec- 
tronic coins has been impractical. In the future, this may change. For the mo- 
ment, these limitations have to be considered. As a result, the ability to make 
exact payments has been of fundamental practical importance. Coin divisibility 
of one form or another has been proposed as the practical solution. 

Divisible coin schemes have not been able to provide unlinkable payments. 
The unlinkability property is achieved if one cannot determine if two payments 
came from the same account. Linkable payments can have substantial implica- 
tions for a user’s anonymity. If a user’s coins are linkable, the user’s identity may 
be revealed by non-cryptographic means, such as correlating payment locality, 
time, size, type, frequency, or by finding a single payment in which the user 
willingly gave out their identity. 

For anonymous on-line schemes, the user can contact the bank on-line in order
to obtain the exact change. A collection of anonymous coins can be exchanged
on-line for a new collection of anonymous coins of differing denominations. In
this way, the appropriate set of coins can be assembled by the customer and
the payment transaction can proceed. Such a scheme was proposed by Brickell,
Gemmell and Kravitz [5]. As previously explained, on-line schemes have not been
in vogue.

Anonymous non-transferable off-line electronic cash schemes are unable to 
provide electronic change in a payment transaction. Dispensing off-line change 
would require the customer to identify themselves to the bank when depositing 
the change. This would reveal the user’s association with the original payment. 

This paper concentrates on the interplay between the design objectives of 
efficiency and anonymity. In particular, we regard the following as the main 
contribution of this research: 

— identifying the hitherto unrecognised difficulty in designing an efficient elec- 
tronic cash scheme which allows exact payment while maintaining effective 
user control over the linkability of payments; 

— illustrating that these problems exist in several schemes proposed in the 
literature; 

— suggesting improvements to mitigate the severity of the problem. 



Organisation. The next section (Section 2) points out some of the practical 
issues that influence effective user anonymity. Section 3 gives an overview of the 
scheme proposed by Radu, Govaerts and Vandewalle [16]. Problems with this 
scheme are discussed in Section 4. Section 5 highlights the presence of the same 
type of problem in other schemes. Possible solutions are explored in Section 6. 

2 Anonymity 

Widely known cash schemes such as that of Chaum and its derivatives are proven
to provide strong anonymity in the sense that it is not possible to correlate coin
withdrawals with payments. In this section, we point out that it is essential to
consider the overall factors that impact on a user’s effective level of anonymity.
Some of these factors are as follows.



Be one of many: For an individual to be anonymous there needs to be a 
crowd in which to hide. A user who is the only participant in an anonymous 
electronic cash scheme cannot conduct anonymous transactions. The number of 
participants and the volume of transactions (both withdrawals and payments) 
are important factors in attaining an acceptable level of anonymity. 



Be unpredictable: The individual’s activities need to be random in both space
and time. For example, a user may be one of many using the Internet (a large
crowd), but if that user has exclusive use of a machine (and uses no other) the
IP address reveals information about the user’s identity. The problem associated
with IP addresses is well recognized. Proxies and mixes [10] have been proposed
as solutions. Reiter and Rubin [17] have published a proposal called ‘Crowds’.

Predictable events in time are also a concern. If it is predetermined that a
payment always occurs immediately after a withdrawal, information linking the
anonymous payment with the non-anonymous withdrawal is revealed. This may
happen if the customer is prone to making impulse purchases when not already
in possession of sufficient funds to make the purchase.

Do not stand out in the crowd: If the electronic cash scheme allows coins 
of any value to be withdrawn, a user who withdraws a coin for an amount 
that is unusual may leak information regarding their identity. A withdrawal 
for a coin of value $157.19 may be readily matched with a payment that uses 
a coin of the exact same value. Even if the choice of possible denominations 
is restricted, some denominations may be withdrawn less often. This suggests 
that the number of available denominations should be limited. Further, if a 
denomination of $1,000,000 is available, only customers belonging to a very elite 
circle are going to have the resources to fund such a withdrawal. It is therefore 
desirable to have a relatively small collection of possible denominations with 
each denomination more or less equally likely to be withdrawn. 

In the typical account-based schemes, some information is necessarily avail- 
able to the bank about its customers. The bank knows the identity of its cus- 
tomers, the amount held in each of its customers’ accounts, the value of each 
withdrawal, who made a specific withdrawal and when that withdrawal was 
made. The bank also has access to deposit records and so can ascertain the flow 
of funds through an account. Therefore, the bank knows which of its customers 
are ‘big’ spenders and which customers are of more modest habits. 

Practical considerations require coins to have an expiry date. If coins remain 
valid indefinitely, the database of previously spent coins that the bank main- 
tains would grow indefinitely. Search times of this database would be adversely 
affected. Having an expiry date implies that a coin is valid for some period after 
being withdrawn. The expiry date and validity period leak information about 
the time at which the coin was withdrawn. The bank knows when withdrawals 
take place. Knowing approximately when a particular coin was withdrawn allows 
the bank to narrow down the search for the owner of the coin to a smaller set of 
customers. If the time of withdrawal is too precisely determined, the customer’s 
identity may be uniquely determined. Ideally, the expiry date for the coin should
not be revealed but instead the status of the expiration of the coin should be
determined using a zero knowledge exchange¹.

Cover your tracks: Some purchases invite the customer to reveal their identity. 
Buying a (paper) book from an on-line bookstore over the Internet is somewhat 
pointless unless a delivery mechanism is available. Providing a name and address 
for delivery (and in case of dispute) renders other anonymity preserving measures 
ineffective. 

Large collections of strongly linked events increase the chance of identity
revelation. If the customer identity is voluntarily provided during one event,
the identity is revealed (with high probability) for all linked events. Even if
the customer identity is not knowingly revealed, the collection of linked events
provides a profile that may reveal the customer identity (with non-negligible
probability).

¹ This is similar to the Millionaire’s problem described by Yao [19].

3 The Radu-Govaerts-Vandewalle (RGV) Scheme 

In this section we outline a scheme of Radu, Govaerts and Vandewalle [16] based 
on the ‘wallet with observer’ paradigm [7]. The scheme seeks to improve the 
efficiency of the withdrawal protocol by breaking a withdrawal up into three 
separate transactions. The idea is to execute the expensive part of the with- 
drawal infrequently. In the following section, we analyse this scheme and use it 
to illustrate our theme of the practical problems of providing anonymity when 
payments are linkable. Subsequently we show that these same problems are com- 
mon to several schemes. Below is a brief overview of the RGV scheme. 

3.1 Bank Setup 

In order to set up the system, the bank completes the following steps:

— Choose primes $p$ and $q$ such that $q \mid p - 1$ and solving the discrete log problem
is hard in the unique subgroup $G_q \subset \mathbb{Z}_p^*$ of order $q$.

— Select distinct generators $g, g_1, g_2 \in G_q$.

— Select a suitable hash function $H(\cdot)$.

— Select a private key $s \in_R \mathbb{Z}_q^*$ and calculate the corresponding public key
$P = g^s$.

— Select a private key $s_1 \in_R \mathbb{Z}_q^*$ and calculate the corresponding public key
$P_1 = g^{s_1}$.

The bank publishes the system parameters $(p, q, g, g_1, g_2, H(\cdot))$ plus the two public
keys $P$ and $P_1$. The bank also establishes three databases:

— accounts database which stores user related information, 

— exchanged coin database which stores previously used big coins, 

— transcripts database which records the transcripts of deposited payment 
transactions. 
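
The parameter-generation steps of the bank setup can be illustrated with a toy Python sketch. It is not the RGV implementation: the primes $p = 23$, $q = 11$ are far too small for real use, the helper names are hypothetical, the hash function and the three databases are omitted, and taking the second public key as $g^{s_1}$ (base $g$) is an assumption of this sketch.

```python
import secrets

def is_prime(m: int) -> bool:
    """Trial division; adequate only for the toy sizes used here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def subgroup_generator(p: int, q: int, exclude: set[int]) -> int:
    """Return an element of order q in Z_p^* that is not in `exclude`."""
    cofactor = (p - 1) // q
    while True:
        a = secrets.randbelow(p - 3) + 2      # a in [2, p-2]
        g = pow(a, cofactor, p)               # lands in the order-q subgroup
        if g != 1 and g not in exclude:
            return g

def bank_setup(p: int, q: int):
    """Generate public parameters and the bank's two key pairs."""
    assert is_prime(p) and is_prime(q) and (p - 1) % q == 0
    gens: list[int] = []
    for _ in range(3):                        # distinct generators g, g1, g2
        gens.append(subgroup_generator(p, q, set(gens)))
    g, g1, g2 = gens
    s = secrets.randbelow(q - 1) + 1          # private key in Z_q^*
    s1 = secrets.randbelow(q - 1) + 1
    P, P1 = pow(g, s, p), pow(g, s1, p)       # public keys (base g is assumed)
    public = {"p": p, "q": q, "g": g, "g1": g1, "g2": g2, "P": P, "P1": P1}
    return public, (s, s1)

public, (s, s1) = bank_setup(23, 11)          # toy primes: 11 divides 23 - 1
print(public)
```

Since $q$ is prime, every element of the subgroup other than 1 generates it, which is why the generator search only has to reject the value 1 and earlier picks.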

3.2 Registration 

During registration, the user $U_i$ (who possesses a computing device $C_i$) opens
an account with the bank $B$ and is issued with a tamper-resistant smart card
$T_i$ to be used in conjunction with $C_i$. The smart card contains a private key $s_T$
embedded there by the bank. The user uses $C_i$ to randomly choose a private key
$s_C$, and the two private keys are used as shares of the user private key $s_U$, which
has associated public key $p_U$. Note that the user does not know $s_U$ but can sign
using this key with the assistance of $T_i$. From the public key, the bank forms an
identity $\pi_0$ for the user. This identity is signed by the bank. The identity and
signature are stored in the accounts database of the bank. The signature on the
identity is sent to the user.

3.3 Withdrawal

The withdrawal transaction has three separate components: get_pseudonym,
withdraw_big_coin and exchange_big_coin. The get_pseudonym transaction is the
most time-consuming and the aim is to reduce the number of times it is performed.
The other two transactions are relatively efficient, so performing them
on a more regular basis is relatively inexpensive. The basic steps relating
to withdrawal are:

— get_pseudonym: The user creates a pseudonym $\pi$ and obtains the bank’s
signature $rb\sigma_S(\pi)$ on $\pi$ using a restrictive blind signature protocol as
described by Brands [3,4].

— withdraw_big_coin: The user anonymously withdraws a ‘big’ coin from the
bank using an ordinary blind signature protocol. In the challenge phase, the
user embeds the pseudonym in the challenge.

— exchange_big_coin: The user exchanges the ‘big’ coin for a collection of ‘small’
coins which can then be spent. The partial details covering the creation of
‘small’ coins are given in Figure 1. The user must provide the bank with the
‘big’ coin along with the pseudonym $\pi$ and $rb\sigma_S(\pi)$ in order that the bank
can check the validity of the ‘big’ coin. The ‘small’ coins are represented
by a public key pair $(pk, sk)$ chosen by the user plus a certificate for this
key pair signed by the bank. The certificates are obtained using a non-blind
signature protocol with the pseudonym $\pi$ embedded in the challenge. A ‘small’
coin has the form

$[(\pi, pk_2), (t, sk_2), \sigma_S(\pi, pk_2)]$

where $t$ is the private key related to the pseudonym $\pi$ ($= \pi_0^t$) and $sk_2$ is a
secret shared by $U_i$’s devices $C_i$ and $T_i$.



3.4 Payment 

The payment transaction uses a shared signature scheme to allow the shop to
gain a signature on a payment specification $\mathit{payspec}$. The shared signature is
generated on behalf of user $U_i$ by the combined efforts of $C_i$ and $T_i$. The payment
specification contains information such as the identity of the shop, the value of
the payment and the date. This results in the shop having a payment transcript

$\underbrace{(\pi, pk_2, \sigma_S(\pi, pk_2))}_{\text{coin}},\ (\mathit{payspec}, S_{TC}(\mathit{payspec}))$

where $S_{TC}(\mathit{payspec})$ is the shared signature of $C_i$ and $T_i$ on the payment
specification.
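
To make the idea of a shared signature concrete, the following sketch shows two devices jointly producing one signature without either holding the full key. It is an illustration under stated assumptions, not the RGV construction: it assumes the shares combine additively ($s_U = s_T + s_C \bmod q$), uses a plain Schnorr signature, and works in a toy group ($p = 2039$, $q = 1019$, $g = 4$). The function names are hypothetical.

```python
import hashlib
import secrets

# Toy group parameters (assumption): g = 4 has prime order q in Z_p^*.
p, q, g = 2039, 1019, 4

def H(*parts):
    """Hash arbitrary values to an integer modulo q."""
    data = "|".join(str(x) for x in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

# Key shares held by the smart card (s_T) and the user's device (s_C).
s_T = secrets.randbelow(q - 1) + 1
s_C = secrets.randbelow(q - 1) + 1
# Public key of the unknown combined key s_U = s_T + s_C (additive sharing assumed).
p_U = pow(g, s_T, p) * pow(g, s_C, p) % p

def shared_sign(payspec):
    """Each device contributes a nonce and a response; neither learns s_U."""
    k_T = secrets.randbelow(q - 1) + 1
    k_C = secrets.randbelow(q - 1) + 1
    r = pow(g, k_T, p) * pow(g, k_C, p) % p   # combined commitment
    c = H(r, payspec)                          # challenge
    z_T = (k_T + c * s_T) % q                  # response computed by T
    z_C = (k_C + c * s_C) % q                  # response computed by C
    return c, (z_T + z_C) % q

def verify(payspec, sig):
    """Standard Schnorr check against the combined public key p_U."""
    c, z = sig
    r = pow(g, z, p) * pow(p_U, -c, p) % p     # needs Python 3.8+ for negative exponents
    return c == H(r, payspec)

sig = shared_sign("shop=42;value=5;date=2000-12-10")
```

The verifier sees only an ordinary signature under $p_U$; the split between the two devices is invisible to the shop.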

[Figure 1: protocol diagram; in its final step the user stores the coin $(\pi, pk_2, \sigma_S(\pi, pk_2))$.]



Fig. 1. RGV ‘small’ coin withdrawal sub-protocol 



3.5 Deposit 

The deposit transaction involves the shop sending the payment transcript to
the bank. The bank checks the ‘small’ coin’s authenticity and also checks the
transcripts database to see if the coin has been previously deposited. Double
depositing of a transcript by the shop can be detected by checking the shop’s
challenge in the shared signature $S_{TC}(\mathit{payspec})$. If the same shop challenge
appears in two shared signatures, the shop has double deposited. Double spending
a coin by the user allows the bank to extract the user’s private key $s_U$ and hence
the user identity $p_U$.



4 Problems with the RGV Scheme 

4.1 Linkability of Coins 

The RGV scheme provides what the designers termed ‘restricted privacy’. If two
coins derived from the same pseudonym, $\pi$, are used in two separate payments
with payment specifications $\mathit{payspec}_1$ and $\mathit{payspec}_2$ respectively, then the two
payment transcripts are as follows:

$(\pi, pk_2, \sigma_S(\pi, pk_2)),\ (\mathit{payspec}_1, S_{TC}(\mathit{payspec}_1))$

$(\pi, pk'_2, \sigma_S(\pi, pk'_2)),\ (\mathit{payspec}_2, S'_{TC}(\mathit{payspec}_2))$

The two transcripts share common information, namely $\pi$, and so are linked. In
this way, all payment transcripts that use coins derived from the same pseudonym
are linked. If coins derived from a given pseudonym are used extensively,
the chain of linked payments will become quite lengthy. It is noted by Radu et al.
[16] that if the owner of a pseudonym is revealed by other means, then the chain
of payments involving coins related to that pseudonym can be traced back to
the user. A long chain of linked payments increases the likelihood that the user’s
identity will be revealed either voluntarily during a purchase or by correlation
of payment details. Over time, multiple pseudonyms will be required to manage
these risks. It was in this sense that the scheme was deemed to provide ‘restricted
privacy’.

The coins described in this scheme are not divisible. It is likely that a single
purchase will require multiple coins to be spent in order to make up the desired
purchase price. If coins derived from different pseudonyms, $\pi_{(1)}$ and $\pi_{(2)}$, are used
in the same purchase (described by $\mathit{payspec}$), then the payment transcripts will
be as follows:

$(\pi_{(1)}, pk'_2, \sigma_S(\pi_{(1)}, pk'_2)),\ (\mathit{payspec}, S'_{TC}(\mathit{payspec}))$

$(\pi_{(2)}, pk''_2, \sigma_S(\pi_{(2)}, pk''_2)),\ (\mathit{payspec}, S''_{TC}(\mathit{payspec}))$

The two transcripts share common information, namely $\mathit{payspec}$, and so this
event links the two pseudonyms, $\pi_{(1)}$ and $\pi_{(2)}$, to the same user. Thus, all past
and future payments made using coins derived from either $\pi_{(1)}$ or $\pi_{(2)}$ are linked
from this point in time.
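
The two linking rules above (a shared pseudonym, or a shared payment specification) compose transitively. The following sketch shows how a party holding the deposited transcripts could compute the resulting clusters with a union-find structure. The reduction of a transcript to a `(pseudonym, payspec)` pair and the sample data are assumptions made for illustration.

```python
from collections import defaultdict

def link_transcripts(transcripts):
    """Group transcript indices that share a pseudonym or a payspec.

    transcripts: list of (pseudonym, payspec) pairs.
    Returns a sorted list of groups, each a list of indices.
    """
    parent = list(range(len(transcripts)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    first_seen = {}
    for i, (pseudonym, payspec) in enumerate(transcripts):
        for key in (("pseudonym", pseudonym), ("payspec", payspec)):
            if key in first_seen:
                parent[find(i)] = find(first_seen[key])   # merge clusters
            else:
                first_seen[key] = i

    groups = defaultdict(list)
    for i in range(len(transcripts)):
        groups[find(i)].append(i)
    return sorted(groups.values())

deposited = [
    ("pi_1", "shopA/10:02"),
    ("pi_1", "shopB/11:30"),
    ("pi_2", "shopC/12:15"),
    ("pi_1", "shopD/14:00"),   # one purchase paid with coins
    ("pi_2", "shopD/14:00"),   # from two different pseudonyms
    ("pi_3", "shopE/15:45"),
]
clusters = link_transcripts(deposited)
```

Here a single mixed purchase at the hypothetical `shopD` merges every payment made under `pi_1` and `pi_2` into one cluster, while `pi_3` stays separate.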

To avoid this problem, payments should be made entirely from coins derived 
from the same pseudonym. It is of course possible that the user has sufficient 
total funds to make a payment but does not have an appropriate collection of 
coins all belonging to the same pseudonym. The policy of making a payment 
with a collection of coins derived solely from the same pseudonym restricts the 
user’s ability to make payments of arbitrary value less than or equal to the value 
of total funds available. This situation is mitigated by decreasing the number of 
active pseudonyms. Having a single active pseudonym is optimal in terms of the 
ability to make payments when sufficient funds are available. 
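
A small worked example shows how the single-pseudonym policy can block a payment that the total funds would cover. The wallet contents and function names below are hypothetical.

```python
def can_pay(coins, amount):
    """True if some subset of the (indivisible) coin values sums to amount."""
    reachable = {0}
    for value in coins:
        reachable |= {r + value for r in reachable if r + value <= amount}
    return amount in reachable

def payable_pseudonyms(wallet, amount):
    """Pseudonyms whose coins alone can make up the exact amount."""
    return [pi for pi, coins in wallet.items() if can_pay(coins, amount)]

# Hypothetical wallet: coin values grouped by the pseudonym they derive from.
wallet = {"pi_1": [5, 2], "pi_2": [5, 1, 1]}
all_coins = [v for coins in wallet.values() for v in coins]
```

With this wallet the user holds 14 units and could pay 8 by mixing pseudonyms (5 + 2 + 1), yet neither pseudonym on its own can make up 8, so the policy forces the user either to link the pseudonyms or to forgo the payment.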

Assuming that pseudonyms are retired from time to time to avoid unrav- 
elling the chain of linked payments, the unspent coins which are linked to the 
retiring pseudonym must be dealt with. Waiting for an opportunity to use the 
last remaining coins derived from the retired pseudonyms may require a deal of 
patience. In fact, it may result in the accumulation of several sets of coins all 
derived from retired pseudonyms. 

If the coins derived from the retired pseudonyms are of substantial value, 
doing nothing may be financially untenable. If the coins are all of small value 
(shrapnel), there may be less financial incentive for the user to address the issue. 
However, although not expressly addressed by Radu et al. [16], coins will need 
to be expired in order to limit uncontrolled growth of the transcripts database. 
So, waiting indefinitely to spend the shrapnel is not an option. 

4.2 Timing Attacks 

The ‘small’ coins exchanged for the same ‘big’ coin are linked to the same pseudonym,
but the identity of the owner of the pseudonym is protected by the anonymous
withdrawal of the pseudonym. Radu et al. [16] note that the exchange process,
exchange_big_coin, should take place in a different session. If exchange_big_coin
takes place in the same session as withdraw_big_coin, the identity of the user
has already been revealed in order that funds could be debited from the user’s
account. If the get_pseudonym protocol has been run previously in the session,
the user’s identity, $\pi_0$, is already known to the bank as this protocol requires
the bank to perform calculations using $\pi_0$. Failure to run exchange_big_coin in
a separate session reveals the identity of the user for all coins (both past and
present) derived from this pseudonym. This scheme relies on the user to conduct
their transactions in the prescribed manner in order to preserve anonymity.

Controlling user behaviour is problematic. If withdrawals take place at a 
specific geographical location such as an ATM, it is unreasonable to expect 
customers to make two visits in order to be subsequently able to make a purchase. 

5 Problems with Divisible Schemes 

The most often applied solution to the problem of making exact payments is to 
provide some method of dividing an electronic coin. If a coin is not divisible, the 
customer must withdraw a coin whenever a payment is required or withdraw 
many coins beforehand and store them for future payments in the hope that the 
right collection of denominations is available. In 1989, Okamoto and Ohta [12] 
first introduced the concept of divisible electronic coins in the form of untrace- 
able electronic coupon tickets. The scheme was based on hash chains similar to 
later proposals that have been put forward for micropayments. The mechanism 
allowed the value of one coin to be subdivided into parts of equal value. 
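
The general hash-chain idea can be sketched as follows. This is a generic illustration in the style of later micropayment proposals, not a reconstruction of the Okamoto-Ohta coupon scheme; the use of SHA-256 and the function names are assumptions.

```python
import hashlib
import secrets

def h(x):
    return hashlib.sha256(x).digest()

def make_chain(n):
    """Return [w_0, ..., w_n] with w_{i-1} = h(w_i); w_0 is the committed root."""
    chain = [secrets.token_bytes(32)]
    for _ in range(n):
        chain.append(h(chain[-1]))
    chain.reverse()
    return chain

def units_paid(root, token, max_units):
    """Count the hash steps from token back to root; None if it never matches."""
    x = token
    for i in range(max_units + 1):
        if x == root:
            return i
        x = h(x)
    return None

chain = make_chain(10)      # one coin subdivided into 10 equal units
root = chain[0]             # the value a bank would certify at withdrawal
```

Releasing `chain[3]` pays three units in total. Every payment from the chain is checked against the same root, which is exactly the kind of common information that makes portions of one coin linkable.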

Below is a brief overview of the main techniques devised for achieving divisible 
off-line electronic cash. These schemes all have the characteristic of linkable 
payments for those payments which use portions of the same divisible coin. A 
point that has not been highlighted is the practical problem of preventing linking 
between payments made with different divisible coins. 

5.1 Binary Tree Based Divisible Electronic Cash 

In 1991, Okamoto and Ohta [13] proposed another scheme which used a binary 
tree representation to implement divisible coins. The scheme used the inefficient 
cut-and-choose methodology as this was the best known technique at the time. 
Electronic coins can be divided and spent off-line in any denomination up to 
their total value. If a user over-spends, i.e., if the user spends an amount higher 
than the coin’s total value, the user is identified with overwhelming probability. 
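
The binary tree representation can be illustrated with a short sketch. It assumes the usual rule for such trees, which is not spelled out in the text above: the root carries the full value, each child carries half the value of its parent, and a node may be spent only if neither it, an ancestor, nor a descendant has already been spent. In the actual schemes a violation is not blocked but leads to identification of the user; here it is simply reported. The class and its heap-style node numbering are hypothetical.

```python
class DivisibleCoin:
    """Binary tree coin: node 1 is the root, node i has children 2i and 2i + 1."""

    def __init__(self, value, depth):
        if value % (1 << depth) != 0:
            raise ValueError("value must be divisible by 2**depth")
        self.value = value
        self.depth = depth
        self.spent = set()

    def node_value(self, node):
        level = node.bit_length() - 1      # root is level 0
        return self.value >> level

    def _conflicts(self, node):
        a = node
        while a >= 1:                      # the node itself or any ancestor
            if a in self.spent:
                return True
            a //= 2
        for s in self.spent:               # any spent descendant
            while s > node:
                s //= 2
            if s == node:
                return True
        return False

    def spend(self, node):
        """Mark node as spent; return False if this would overspend."""
        level = node.bit_length() - 1
        if node < 1 or level > self.depth:
            raise ValueError("node outside the tree")
        if self._conflicts(node):
            return False
        self.spent.add(node)
        return True

coin = DivisibleCoin(value=8, depth=3)
results = [coin.spend(n) for n in (2, 6, 4, 3, 7)]
total_spent = sum(coin.node_value(n) for n in coin.spent)
```

Nodes 2, 6 and 7 (worth 4, 2 and 2) are accepted and exhaust the coin, while node 4 (below the spent node 2) and node 3 (above the spent node 6) are rejected.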

The divisibility service provided by the binary tree mechanism is imple- 
mented in the payment transaction. The payment transaction requires the cus- 
tomer to send the merchant both an electronic license, B (created at account 
creation) and the electronic bill, C (created during withdrawal and linked to 
B). As a result, all payments share a common piece of information, B and thus 
are linked together. As noted by Pfitzmann and Waidner [15], the more the
same license is used, the more likely it is that the user can be traced by other means (i.e.,
correlating payment locality, date, type, frequency, etc.). 

In Eng and Okamoto’s scheme [9], the more efficient alternative to ‘cut-and- 
choose’ coins (single-term coins) proposed by Brands [3] was combined with 
a binary tree technique to construct divisible coins. The payment transaction 
consists of two stages - coin authentication and denomination revelation. The 
coin authentication stage involves sending the merchant the same information (a 
signature from the bank specific for a given coin) each time a payment is made 
using a given divisible coin. Hence, this information can be used to link together 
payments involving the same coin. 

Okamoto [14] later presented what was described as the first practical divisi- 
ble untraceable cash scheme. The binary tree mechanism was used in conjunction 
with an electronic license mechanism. All protocols were of comparable efficiency 
when compared with the most efficient non-divisible off-line electronic cash sys- 
tems available, with the exception of the account establishment protocol. This 
protocol was by far the most expensive. Hence, this scheme can be practical
only if account establishment is performed infrequently (typically once) for each 
user. Account establishment creates a ‘license’ with which coins are withdrawn. 
The result of not performing account establishment at each withdrawal is that 
coins withdrawn using the same license can be linked. 

The system of Chan, Frankel and Tsiounis [6] is a development of Okamoto’s 
proposal [14]. The major advantage of this system is that the construction of the 
electronic license can be performed more efficiently. The bulk of the computation 
in Okamoto’s user account establishment protocol relates to the construction of 
this license. It becomes feasible to create an electronic license on a per coin basis 
and hence this functionality can be included in every coin withdrawal. There 
is no trade-off between the degree of unlinkability among coins and efficiency 
attained. While direct linking across coins is addressed, the linkability between 
portions of the same coin remains a problem. 

5.2 Brands’ Cash 

In 1993, Brands [3] published an efficient scheme for non-divisible coins which 
only required the construction of a single term (unlike the cut-and-choose 
method). The security of the scheme is based on the difficulty of solving the 
representation problem. The ideas encapsulated in this design have become a 
basic building block for many of the schemes that have been proposed since. 
Extensions to the basic scheme were described in the original paper. These ex- 
tensions included a realisation of electronic checks and a further refinement to 
divisible checks. The rationale for investigating both extensions was the difficulty
of making exact payments while managing a large number of fixed denomination 
coins. 

Brands [3] noted that there are some drawbacks to these extensions. For the 
basic non-divisible check extension, the problem of linking by complementary 
amounts was identified. The bank can obtain some information from the amount 
for which a refund is requested, since on its refund-list complementary amounts 
appear. Hence, the bank can exclude many of the tuples on the refund-list, since
they do not form the complementary amount to that requested for refund. 

For the divisible check, all payments made with the same check are linkable 
and no solution to this problem was offered. It was suggested that the user could 
request a refund for the unspent part of a divisible check. However, claiming 
a refund for unspent check portions would reveal the user’s identity as this is 
necessary in order to deposit the refund into the user’s account. 



5.3 Revokable and Versatile Electronic Money 

Jakobsson and Yung [11] describe a revokable electronic cash scheme. The main 
objective of this scheme is to allow revocation of a user’s anonymity in the event 
that sufficient evidence exists to indicate criminal behaviour. This is achieved 
through the introduction of an ombudsman or trusted third party who is able 
to revoke a user’s anonymity unconditionally. Special provisions to revoke the 
identity of double spenders are no longer required or implemented. Overspending 
a coin is simply one of the events that can trigger anonymity revocation by the 
ombudsman. 

The availability of this general revocation mechanism allows the scheme to 
exploit the challenge-response protocol used during the payment transaction. 
Not all of the bits of the random challenge need to be random. Some of the 
bits can be allocated specific meaning while the rest are random. The challenge 
semantics of a coin describe the functionality of the coin by assigning different 
meanings to different bits of the challenge. It is not possible to alter the challenge 
semantics of a coin once it has been spent. 

One possible application of challenge semantics is to interpret bits of the 
challenge as the amount of the coin which is being spent in a given transaction. 
The bank will accept coins for deposit as before, and credit the accounts of the 
depositor by the amount indicated in the challenge. Spending a coin more than 
once is no longer the trigger event for anonymity revocation. Now overspending 
occurs if one coin is used to transfer more funds than its related value allows. It 
is this event which triggers anonymity revocation by the ombudsman. 

Since the same coin is now allowed to be spent multiple times, the payment 
transactions which share a common coin are linkable. 
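
The use of challenge bits to carry the amount, and the resulting overspending check at the bank, can be sketched as follows. The bit layout (16 amount bits below 112 random bits), the class and the function names are assumptions chosen for illustration; the scheme of Jakobsson and Yung does not prescribe them here.

```python
import secrets
from collections import defaultdict

AMOUNT_BITS = 16     # assumed: low bits of the challenge carry the amount
RANDOM_BITS = 112    # assumed: remaining bits stay random

def make_challenge(amount):
    """Build a challenge whose low bits encode the amount being spent."""
    if not 0 <= amount < (1 << AMOUNT_BITS):
        raise ValueError("amount does not fit in the amount field")
    return (secrets.randbits(RANDOM_BITS) << AMOUNT_BITS) | amount

def amount_from_challenge(challenge):
    return challenge & ((1 << AMOUNT_BITS) - 1)

class Bank:
    def __init__(self):
        self.deposited = defaultdict(int)   # coin id -> total deposited so far

    def deposit(self, coin_id, coin_value, challenge):
        """Credit the encoded amount; flag the coin if its value is exceeded."""
        amount = amount_from_challenge(challenge)
        self.deposited[coin_id] += amount
        overspent = self.deposited[coin_id] > coin_value
        return amount, overspent

bank = Bank()
first = bank.deposit("coin-1", 100, make_challenge(60))
second = bank.deposit("coin-1", 100, make_challenge(30))
third = bank.deposit("coin-1", 100, make_challenge(20))
```

The third deposit pushes the total for `coin-1` to 110, above its value of 100, which is the event that would be referred to the ombudsman. The bank's per-coin running total also shows directly why all payments from one coin are linkable.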



5.4 Consequences of Linking Payments 

Constructing unlinkable divisible electronic cash has turned out to be a difficult 
problem. In Tsiounis’s thesis [18], it is left as an open problem to find a way to 
break the linkability between portions of the same coin and therefore construct 
an unlinkable divisible coin scheme. It is conceded that this may not be possible. 

One of the most challenging problems faced by the electronic cash scheme de- 
signers is that of providing exact payment. The divisible electronic cash schemes 
described previously all trade off some degree of anonymity (in the form of link- 
ability of sets of payments) in order to provide divisibility and hence make exact 
payments. With linkable divisible coins, it is accepted that portions of the same
coin will be linkable. Payment transcripts are deposited at the bank where pay- 
ments involving portions of the same coin are readily identified as belonging to 
the same customer. 

However, this is not the extent of the problem. If portions of two separate 
coins are used by a customer during a payment for a single purchase, the mer- 
chant will certainly know that both coins belong to the same user. The merchant 
will (knowingly or unknowingly) inform the bank of this relationship as pay- 
ment transcripts routinely contain the merchant’s identity plus time of purchase 
and/or transaction counters. As a result, the set of payments belonging to both 
coins are linked together. Using more than one divisible coin in a payment needs 
to be avoided if the minimum degree of linkability is to be achieved. 

To minimise the length of the chain of linked payments, the customer must 
select a single coin with which to make a particular payment. The coin must 
have sufficient value remaining so as to be suitable for the payment. That is, the 
customer is required to undertake coin management without a priori knowledge 
of the payment order or values required. 

One strategy might be to use a ‘least possible fit’ algorithm whereby the 
smallest valued coin capable of making the payment is selected. Whichever strat- 
egy is employed, it will be the exception rather than the rule that a coin is com- 
pletely spent. Some ‘small’ residual amount will remain which cannot easily be 
used on its own in a payment either now or in the future. In order to prevent 
cross linking of coin payments, this ‘residual’ coin must be retained in the hope 
that a suitable opportunity will arise which will allow the coin to be completely 
spent. This certainly presents a problem - namely, what to do with the residual 
‘shrapnel’ from each coin. 
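
The ‘least possible fit’ strategy and the residuals it leaves behind can be illustrated with a short sketch. The wallet contents and the sequence of prices are hypothetical.

```python
def least_possible_fit(remaining, amount):
    """Pick the coin with the smallest remaining value that still covers amount."""
    candidates = [(value, cid) for cid, value in remaining.items() if value >= amount]
    if not candidates:
        return None
    return min(candidates)[1]

def pay(remaining, amount):
    """Pay from a single coin; return its id, or None if no single coin suffices."""
    cid = least_possible_fit(remaining, amount)
    if cid is None:
        return None
    remaining[cid] -= amount
    return cid

wallet = {"coin_a": 50, "coin_b": 20, "coin_c": 100}
used = [pay(wallet, price) for price in (18, 45, 30, 60)]
blocked = pay(wallet, 12)
```

After four purchases the wallet holds residuals of 5, 2 and 10. A further payment of 12 cannot be made from any single coin even though 17 units remain in total, so the user must either combine coins (and link them) or hold the shrapnel.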



6 Possible Improvements 

It is useful to consider the RGV scheme as a pseudo-divisible coin scheme as it 
shares the same linkability characteristics as the divisible coin schemes. A similar 
approach was used by Tsiounis [18] in evaluating the relative performance of 
divisible coin schemes. The ‘small’ coins represent static subdivisions of the ‘big’ 
coins. They are similar to the more dynamic subdivisions of the other divisible 
coin schemes. The unspent ‘small’ coins derived from a retired pseudonym pose 
the same problem as unspent portions of divisible coins. 

The customer may choose to solve the problem of residuals (in order to 
maintain the desired level of anonymity) by discarding the troublesome shrapnel. 
This is not likely to be very appealing to customers as they lose value by doing 
this. Furthermore, it will disrupt the reconciliation process at the bank. The 
bank is concerned about the fraudulent creation of value in the system, but the 
unaccounted destruction of value is also a problem. 

The performance of back-end processing subsystems may also be adversely 
affected if coins are allowed to stay in circulation indefinitely. Practical anony- 
mous electronic cash schemes maintain a spent coin database in order to detect 
previously spent coins. Minimising the size of this database is essential for good
performance with respect to the back-end accounting checks. For an on-line 
scheme, this is even more crucial. To this end, coins typically have an expiration 
date after which they will not be accepted by merchants. In order that the cus- 
tomer not lose value, schemes must allow coins to be returned to the bank for a 
refund if they expire before being spent. If the residual value of a coin is allowed 
to be returned to a customer’s account, the customer must give their account 
details which reveals their identity. All transactions involving the returned coin 
can then be linked to the user. 

As an alternative to providing a refund, a scheme may refresh the old expired 
coin. Clearly, the refreshed coin would have value equal to the unspent portion 
of the old coin. The new refreshed coin is also anonymously withdrawn and so 
would not be linked to the old coin. The refresh process may also be applied to 
unexpired coins. If the value of the refreshed coin were topped up by a further 
withdrawal of funds by the customer, the customer would have to reveal their 
identity and therefore their association with payments made using the old coin. 
Also note that if more than one coin is refreshed during the same session, the 
relationship between these coins is revealed and their payments are linked. 

We suggest a combination of both refunding and refreshing as a solution 
to this problem. Refreshing a partially spent coin breaks the linkage with the 
previous payment transactions. Returning this refreshed coin to the bank now 
reveals no direct link to previous payments. See Figure 2 for an overview of the 
life cycle of such a divisible coin. 

However, as has been mentioned previously, the timing of these steps needs to 
be considered carefully. Refreshing should not be performed in the same session 
as withdrawal. The problem of linking by complementary amounts identified 
by Brands [3] still applies. Further, this two step process is most likely to be 
conducted with relatively small residual values. This raises questions about the 
cost and efficiency of this process. 

If the customer has cause to regularly purchase items that consume the resid- 
ual values, the above problem may not arise. Perhaps purchasing value in another 
payment scheme such as a micropayment scheme could be considered an effective
strategy for securing the minimum linkability of payments. However, these
solutions fall outside the control of the electronic cash scheme designer. Exist- 
ing divisible cash schemes do not address the issue of avoiding cross linking of 
payments. 

7 Conclusion 

This paper has explored the ramifications of deploying a practical electronic 
cash scheme. We have illustrated that it is difficult to balance the disparate forces
influencing the design process. While many proposed schemes accept that the 
transactions made with the same divisible coin are linkable, we have highlighted 
the difficulty in avoiding linkability between different coins. If this problem is 
not solved, a user’s transaction history becomes an unbroken chain. Although
we have proposed initial solutions, it remains an open problem to determine an
acceptable compromise.

Abstract. Securing mobile code commerce is not an easy task.
We propose in this paper a framework to tackle this problem.
[...] for that purpose. In a third step, we build a framework based on the
analysis done. This framework has been implemented to show its validity.

1 Introduction

Computers are increasingly interconnected through networks such as the Internet. This
high level of connectivity allows new distribution channels to be developed for
non-physical components. At the same time, it also opens new ways to breach the
IPR (Intellectual Property Rights) of the owners of those components.

This is especially the case for software components. It is very easy to sell
programs over the Internet. It is also easy to copy them. An average user can
install them on many computers, use them for longer than he is allowed to and
even share them on the Internet. With a little more complexity and time, an average
software engineer can reuse, repackage and resell your component under his own name.
He can also decompile it and steal your proprietary code. This is increasingly the
case with multiplatform languages such as Java.

This interconnected world is full of new opportunities for the software busi- 
ness but it also allows a new range of pirating methods that can kill this business. 

We consider it useful to build a practical system to address these issues. That
is the purpose of FILIGRANE [7], a European-funded project. FILIGRANE is a
security framework that allows secure trading of mobile code on the Internet. This
framework offers many services, such as registration of mobile code, certification of
the entities involved, copyright protection, protection against code modification,
and protection against illegal use and copying.

We describe the framework in section 2, the general issues raised in section 3
and the techniques used (and their limitations) in section 4, and give an
overview of the protocols in section 5. As an implementation of this framework,
we developed a demonstrator that is focused on mobile Java applets and
applications.



2 FILIGRANE Functional Architecture 

In this section we introduce the actors and the main interactions between
them. This is an overview of the architecture of FILIGRANE.

2.1 Actors 

Below we give the list of actors associated with the FILIGRANE functional 
model. This model is based on the model developed in the IMPRIMATUR 
project [8]. The FILIGRANE system is composed of the following actors: 

— Certification Authority (CA): a Trusted Third Party (TTP), providing services
for the creation and distribution of electronic certificates for producer,
end user, provider and the Rights Clearing House (RCH);

— Producer: a software developer or company, offering mobile code to a provider 
for e-commerce; 

— Provider: actor providing goods such as software, services and information. 
The Provider sells services and/or electronic delivery of items for sale such 
as software. The Provider negotiates the contract conditions for the use of 
services electronically; 

— End user: an authorized holder of a certificate supported by a CA, and
registered to perform software download by the FILIGRANE system;

— Rights Clearing House (RCH): this actor is dedicated to the definition and 
redistribution of rights between actors of the system as the result of a trans- 
action; 

— Fee Collecting Agency: this actor is responsible for collecting funds as the 
result of financial transactions and to redistribute them proportionally to 
the various actors of the system according to the conditions of the associated 
contracts. This operation requires a tight linkage between the Fee Collecting 
Agency and the RCH; 

— Quality Label Service: this optional actor qualifies mobile code to be
distributed with various quality labels recognized by potential purchasers;

— E-notary: this actor will notarize all transactions in the system and act 
as a trusted repository for all actors in order to constitute a full database 
guaranteeing the rights of all actors in the system. 

2.2 Functional Model

The focus of FILIGRANE is to build a secure framework for exchanging mobile
code between actors and managing the rights associated with mobile code.
Therefore, we concentrated on the following steps: ordering, contract handling,
delivery of goods and usage control on the end user side. We leave aside the
payment part (left part in Fig. 1), as we can use off-the-shelf components for
it.



3 General Issues 

Now that we have a clear picture of the architecture of FILIGRANE, we can 
identify the security objectives we need to achieve. This framework should be 
usable in a real world system. We are not doing security for the sake of security. 
Moreover, the end user must adopt FILIGRANE. Therefore, we must also identify
the requirements of this end user.



3.1 Contract Handling 

We must be able to automatically produce a contract between the provider
and the end user and generate the associated set of rules that will control the 
execution of the program on the end user’s computer. We should not only protect 
the provider against the end user but also the rights of all actors. Therefore the 
contract and the rules must be notarized by the e-notary so that the end user is 
protected against abuse from the provider. 



3.2 Delivery of Goods 

We must provide a mechanism that allows the software and the associated
material (contract, rules, etc.) to be downloaded securely onto the end user system.
This means not only that the data cannot be intercepted but also that the
end user is convinced that the downloaded data really comes from the provider
from which he purchased the software.



3.3 Generic Software Protection 

We want to prevent the following actions: 

— Illegal copy: Prevent an unauthorized user from copying the program to another
computer and still being able to execute it. Note: we do not need to prevent the
copy as such; we need to ensure that the copy is unusable.

— Illegal use: Prevent an unauthorized user from executing the program. This must
hold even if an authorized user installed the program on a computer
and an unauthorized user then tries to execute it on the same computer.

Fig. 1. FILIGRANE functional model



— Reverse engineering: Prevent a software engineer from decompiling the program.
Once a program is decompiled, its proprietary algorithms are exposed, so it is
important to prevent this.

— Illegal reuse: Prevent the reuse of any component of the program.
Nowadays, most programs consist of several functional blocks that
can be reused in other programs. A rogue software producer must not be
able to replace the user interface of the program and resell it
under its own brand.

— Illegal modifications: Prevent any modification of the shipped program.

3.4 Usage Control 

Even if an authorized user executes the program, we want to be able to put some 
restrictions on the usage of the program: 

— Duration limitation: Limit the number of hours to run the program. 

— Date limit: Limit the dates when the program can be executed. One interesting
purpose is to prevent outdated programs from being run.

— Number of uses: Limit the number of uses of the program. 

— Features limit: Restrict some features to some users. 

— Pay per use: Each time the user executes one particular function of the program,
he must pay for this use.
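
The restrictions listed above could be represented as a rule set that is checked before each execution. The sketch below is hypothetical: the field names, the order of the checks and the reason strings are assumptions, it is not the FILIGRANE rule format, and pay per use is left out because it depends on the payment component.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass(frozen=True)
class UsageRules:
    max_hours: Optional[float] = None       # duration limitation
    not_after: Optional[date] = None        # date limit
    max_uses: Optional[int] = None          # number of uses
    features: frozenset = frozenset()       # features this user may run

@dataclass
class UsageState:
    hours_used: float = 0.0
    uses: int = 0

def may_run(rules, state, today, feature):
    """Return (allowed, reason) for running `feature` on `today`."""
    if feature not in rules.features:
        return False, "feature not licensed"
    if rules.not_after is not None and today > rules.not_after:
        return False, "date limit passed"
    if rules.max_uses is not None and state.uses >= rules.max_uses:
        return False, "number of uses exhausted"
    if rules.max_hours is not None and state.hours_used >= rules.max_hours:
        return False, "duration limit reached"
    return True, "ok"

rules = UsageRules(max_hours=10.0, not_after=date(2001, 12, 31),
                   max_uses=3, features=frozenset({"view", "print"}))
state = UsageState(hours_used=2.5, uses=2)
```

A check such as this is only meaningful if the rules and the usage state are themselves protected against modification on the end user system, which is the subject of the software protection objectives in subsection 3.3.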



3.5 End User Point of View 

— Technical requirements. We should minimize the requirements made on the
end user platform. We should avoid requiring too many add-ons to the end
user system. Otherwise, the end user will not use FILIGRANE because he
does not have the proper hardware and software.

— End user privacy. This should not be forgotten. The end user does not want
a central authority to be able to track all the purchases and uses he makes.

— User friendliness. The system must be user-friendly. Otherwise, the end user
might try it once but will abandon it because it is too complex.

— Cost. The system must also have advantages for the end user. If it is more
expensive than a classical one, the system is less likely to be adopted.

Dean [5] states that inappropriate management of the user point of view
was at the origin of the failure of the DIVX system.

4 Software Protection and Their Tools 

We already stated in subsection 3.3 the main objectives we want to achieve.
However, remember that we are not seeking the ultimate protection: it would be
costly and too annoying for honest users. Basically, we want the 'cost' for
pirates of breaking the protections to be higher than the cost of buying the
software legally. The meaning of 'cost' can be very wide. In a recent paper [6],
Devanbu and Stubblebine developed a nice model of the economics of piracy.

This model can be summarized as follows. Suppose that an entity needs n copies
of the same software item. This entity can either buy the item (at cost $C_b$
per copy) or obtain it illegally. In the latter case, the first step is to hack
the protection mechanism (cost $C_h$) and then make n copies (each costing
$C_c$). The law intervenes in this model because the entity can get caught
(probability $P_i$) and face fines and other penalties (cost $C_i$). Usually
these last two values ($P_i$ and $C_i$) depend on n. Unfortunately, the reality
is that:

$$n \cdot C_b \gg C_h + n \cdot C_c + P_i(n) \cdot C_i(n)$$

The objective is therefore to shift the balance so that most people have
a strong incentive to buy the software items. Let us first briefly examine the
non-technical variables. The cost $C_b$ can be reduced; this is the approach
chosen by the free software industry. The other non-technical variable is the
cost $C_i$, which is likely to increase in the coming years with new regulations
such as the DMCA (Digital Millennium Copyright Act). The three remaining
variables ($C_h$, $C_c$ and $P_i$) can at least partly be addressed with
technical solutions.
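As a worked illustration of the inequality above, the following Python sketch compares its two sides. All numeric values, the function names and the shapes chosen for $P_i(n)$ and $C_i(n)$ are assumptions made for this example only; they are not taken from [6].

```python
def cost_of_buying(n, c_b):
    """Left-hand side: n legal copies at unit price c_b."""
    return n * c_b


def cost_of_piracy(n, c_h, c_c, p_i, c_i):
    """Right-hand side: hack once, copy n times, plus the expected legal cost.

    p_i and c_i are callables because both may depend on n.
    """
    return c_h + n * c_c + p_i(n) * c_i(n)


def piracy_is_cheaper(n, c_b, c_h, c_c, p_i, c_i):
    return cost_of_piracy(n, c_h, c_c, p_i, c_i) < cost_of_buying(n, c_b)


# Hypothetical shapes for the legal risk (assumptions, not measured data).
def p_i(n):
    return min(1.0, 0.0001 * n)   # chance of getting caught grows with n


def c_i(n):
    return 1000.0 * n             # fine proportional to the number of copies


# Weak protection: hacking costs 200. Strong protection: hacking costs 20000.
weak = piracy_is_cheaper(100, c_b=50.0, c_h=200.0, c_c=1.0, p_i=p_i, c_i=c_i)
strong = piracy_is_cheaper(100, c_b=50.0, c_h=20000.0, c_c=1.0, p_i=p_i, c_i=c_i)
print(weak, strong)
```

With these made-up figures, raising only the hacking cost $C_h$ is enough to reverse the inequality, which is the effect the technical measures in this section aim for.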

This is a fine model, but piracy is not limited to illegal use of the software
item. Piracy can also arise when a rogue software company tries to steal parts
of your code or reverse engineer your code in order to extract your proprietary
algorithm. We therefore need to extend the equation of Devanbu and Stubblebine,
and we built similar models for this kind of piracy. The rogue software company
can rewrite the component (cost $C_{rw}$), steal a component (cost $C_s$) or
reverse engineer a component (cost $C_e$). We want a software company to have a
strong economic advantage in actually developing a component instead of stealing
it. Consequently, what we want is:

$$C_{rw} \ll C_s + P_i(n) \cdot C_i(n)$$

$$C_{rw} \ll C_e + P_i(n) \cdot C_i(n)$$

Note that $P_i$ and $C_i$ are not exactly the same as in the first equation, but
as a first approximation we consider them to be the same. Although $C_{rw}$
is a technical cost, we cannot really influence it. To achieve the result we
want, we need to increase $C_s$ and $C_e$ with technical solutions.

We will now review the technical choices we made. For each solution, we will 
explain our choices, detail their protection and list the remaining attacks. 

One important point we should not miss is the targeted platform. Some weak- 
nesses against piracy can appear with specific run-time environments. Of course, 
others can disappear. We will face different problems in different environments 
like: set-top boxes, smart cards, computers. Even in computers, the run-time en- 
vironment will change the weaknesses: Unix (Linux), Mac OS, Windows, Java, 
. . . The solutions we will show are useful for most platforms but not all. Some 
of them may not be needed in particular cases. As the demonstrator of FIL- 
IGRANE runs inside a Java Virtual Machine (JVM), an emphasis will be put 
on counter-measures that must be applied in that context. 

4.1 Encryption 

Encryption serves two different purposes: preventing unauthorized execution of
the mobile code and preventing access to the mobile code (compiled code,
resource files). The user needs the decryption key to be able to run the
software. This is a problem because, if the user is a hacker, it will be easy
for him to steal the key. Therefore the key is stored in a smart card (see next
section), from which it cannot be retrieved as such. The key depends on the
user, not on the computer. This is user-friendly because the user can easily
change computers.

Encryption is a powerful way to modify the economic equations. It increases
the cost of hacking ($C_h$), the cost of stealing a component ($C_s$) and the
cost of reverse engineering ($C_e$).



Other Attacks 

— A first attack consists of monitoring the memory and looking for the
decrypted code, since the code must be decrypted before execution. The
attacker needs to execute all parts of the program in order to gather the
complete code. On the platform we chose (Java), this attack is more complex.
First, the code is sliced into classes that are decrypted only when they are
needed. Second, the code is not stored in memory as a stream of bytes, but in
a structured way. Moreover, the code usually goes through the just-in-time
compiler, which transforms Java bytecode into native code. In the case of
Java, we do not think that this attack is truly realistic.

— Second attack: intercept the key. This attack is the most dangerous one. 
Due to bandwidth limitations, we cannot decrypt the code inside the smart 
card. It would be too slow. The decryption takes place inside the computer. 
Therefore the key must be transmitted to the computer and temporarily 
stored inside the computer memory. We are investigating methods to avoid 
this security breach. 



4.2 Smart Cards 

The first question that arises is: why a hardware token? We needed a
tamper-resistant box where we can store values securely. The stored values are:
the usage control values (time limit, duration limit, ...); the decryption key
of the program (from the previous section); and a private key linked to the
user.

A hardware "dongle" or a specific medium (floppy disk or CD-ROM) does not
fit our requirements, as we are dealing with the trading of mobile code that can
be downloaded from the Internet. Therefore both are ruled out. The user is
required to have a smart card and a smart card reader, and this is a constraint
that the user may not accept. However, we think that this hardware requirement
is the least annoying one. Smart cards are more and more widespread (at least in
Europe) and support for smart cards is already embedded in recent operating
systems like Windows 2000. We expect that in a few years' time, smart card
readers will be built into computers, as they are already a recommended device
in the PC’99 specification [4].

Nowadays, we can consider smart cards to be truly tamper-resistant boxes.
Physical attacks are quite complex to mount and need specialized hardware
devices [13]. Side-channel attacks only retrieve keys used in some particular
situations [11,12].

The smart card allows us to increase the cost of hacking the program ($C_h$)
and strengthens the security of the encryption of the code.



Other attacks. In our view, emulation of the smart card is the only remaining
attack that can be mounted against it. Because the smart card is accessed
via 'drivers', these can be hacked and redirected to a smart card emulator. The
attack is made even more complex because data are downloaded onto the smart
card through an encrypted channel (details in section 5.4). The attacker first
needs to extract those data from the smart card before being able to emulate it,
because the private key used to establish the encrypted channel is pre-loaded
when the smart card is issued.

4.3 Watermarking 

Context. Watermarking makes it possible to embed hidden information inside any
byte stream. Watermarking as such is not a protection mechanism that prevents
piracy. Rather, watermarks are used to protect the rights of the copyright
holder of the data, and they are therefore used after an illegal operation has
taken place. Concrete objectives for watermarks are (from [14]): traceability of
the data, robustness and imperceptibility of the watermark.

Watermarking technology was first developed for multimedia
byte streams like pictures, sounds and video [16], and watermarking techniques
are often media dependent.

Applied to code, these objectives and requirements do not really change, but
imperceptibility is more crucial in the case of executable code. If an image is
slightly modified, its general meaning remains, but for code, modifying the
behaviour is not an option. This constraint is therefore very important, and
we formalized it in [18] (extended form of the definition of [2]) as:

Definition 1. Transform a program P into a watermarked program P' with 
the same observable behaviour. If P fails to terminate or terminates with an 
error, then P' fails to terminate or terminates with an error. Otherwise, P' 
must terminate and produce the same output as P. 

In [2], Collberg et al. presented a survey of the existing software
watermarking techniques. Their analysis is useful because it shows that
almost all existing techniques are based on tips and tricks. Although those can
be very efficient, we have no guarantee of the security of those watermarks. We
must also remember that even in other media, such a guarantee is not always
available.

In [18], two of the authors of the current paper and others built a new
theoretical model and a new type of software watermark based on already tested
techniques. The main idea is to reuse existing and quite secure techniques
already used in other domains such as images and audio. The reused technique is
the spread-spectrum technique. This technique adds a mark (a vector
of pseudo-random values with low amplitude) in the frequency domain of the
image or the audio signal. This mark is very difficult to remove or alter (see
[9] for more details). For code, the frequency vector to which the mark is added
is composed of the frequencies of occurrence of doublets and triplets of
instructions. In order to add the mark, these frequencies must be modified. The
modification is done by building a dictionary of equivalent code instructions.
With a heuristic algorithm, instructions of the code are replaced by an
equivalent group of instructions in order to approximate the new frequency
vector (the original frequency vector + the mark).
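The two vectors mentioned above can be sketched in Python as follows. This is only an illustration under stated assumptions: the instruction mnemonics, the function names, the seed and the amplitude are hypothetical, and the heuristic substitution step that rewrites the code towards the target vector (the core of the technique in [18]) is not shown.

```python
import random
from collections import Counter


def ngram_frequencies(instructions, sizes=(2, 3)):
    """Relative frequency of each doublet and triplet of consecutive instructions."""
    counts = Counter()
    for size in sizes:
        for start in range(len(instructions) - size + 1):
            counts[tuple(instructions[start:start + size])] += 1
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()}


def target_vector(frequencies, secret_seed, amplitude=0.01):
    """Original frequency vector plus a low-amplitude pseudo-random mark."""
    rng = random.Random(secret_seed)      # the seed plays the role of the secret
    target = {}
    for gram in sorted(frequencies):      # fixed order keeps the mark reproducible
        mark = rng.uniform(-amplitude, amplitude)
        target[gram] = frequencies[gram] + mark
    return target


# Hypothetical instruction stream (mnemonics are illustrative only).
code = ["load", "load", "add", "store", "load", "add", "store", "return"]
freq = ngram_frequencies(code)
target = target_vector(freq, secret_seed=1234)
```

The same seed always yields the same mark, so whoever holds the seed can recompute the target vector later; anyone without it sees only small frequency shifts.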



Implementation. As this technique cannot be easily embedded within FILIGRANE,
we are currently working to improve it. The new developments
will be the subject of a specific paper. When implemented, watermarking will
not prevent piracy but will increase the probability of getting caught ($P_i$).

4.4 Obfuscation 

Context. Intuitively, the obfuscation of the code is aimed at modifying the 
compiled code in order to make it non-understandable. We can be more formal 
and use this definition (extended form of the definition in [3]): 

Definition 2. Transform a program P into another program P' harder to re- 
verse engineer with the same observable behaviour. If P fails to terminate or 
terminates with an error, then P' fails to terminate or terminates with an error. 
Otherwise, P' must terminate and produce the same output as P. 

To be complete, we need to define what "harder to reverse engineer" means.
This can have two meanings, and both are of interest here: altering the code to
make it very difficult to decompile (target: the decompiler/deobfuscator) and
altering the code to make it incomprehensible once decompiled (target: the
engineer).

The first point is really important in the case of Java. Usually, if we have the
assembly code of a program, the task of the decompiler is difficult but
not impossible [1]. As Java is executed in a Java Virtual Machine (JVM), there
is a Java assembly language, which is of a higher level than a classical
assembly language. This is mainly due to the security and bug-free properties of
the JVM. As a result, it is far easier to decompile Java bytecode [15,17].

Code obfuscation also has a useful side effect. If the code is harder to under- 
stand, it means it will be harder to reuse some parts of it too. 

One may argue that encryption will suffice for our purpose. Remember, however,
that the decryption key will be temporarily present in memory. Therefore a
determined hacker, with a lot of effort, will get this key and be able to
decrypt the code. We want that, even in this case, he will not be able to
extract any useful information.

Implementation. As obfuscation is a really important protection technique
for Java programs, we implemented advanced obfuscation methods in FILIGRANE.
Those methods were first described in [3]. We chose a useful subset
that encompasses: scrambling identifiers, removing debugging information,
removing information that is not strictly necessary, and modifying the control
flow of the code by introducing goto instructions with misleading code.

The first three methods are targeted against the reverse engineer, whereas the
last one is targeted mainly against the decompiler. The last one is the most
effective because the way we implemented it fools decompilers, which are then
unable to produce working decompiled code: some instructions are missing or
misplaced after decompilation.

Obfuscation does not at all prevent piracy of the program as a whole,
but it increases the cost of stealing a component ($C_s$) and the cost of
reverse engineering ($C_e$).
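To make the first method (scrambling identifiers) and the behaviour-preservation requirement of Definition 2 concrete, here is a minimal sketch. It is an assumption-laden illustration, not the FILIGRANE implementation: it works on Python source through the standard `ast` module (Python 3.9 or later for `ast.unparse`) rather than on Java bytecode, it only renames parameters and assigned variables, and it assumes functions are called with positional arguments. The sample function `total_price` is hypothetical.

```python
import ast


class IdentifierScrambler(ast.NodeTransformer):
    """Rename function parameters and assigned variables to meaningless names."""

    def __init__(self, names):
        self.mapping = {name: f"v{index}" for index, name in enumerate(sorted(names))}

    def visit_Name(self, node):
        node.id = self.mapping.get(node.id, node.id)
        return node

    def visit_arg(self, node):
        node.arg = self.mapping.get(node.arg, node.arg)
        return node


def scramble(source):
    tree = ast.parse(source)
    # Only names bound by assignment or as parameters are renamed, so builtins
    # and function names (the visible interface) are left untouched.
    bound = {n.id for n in ast.walk(tree)
             if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)}
    bound |= {n.arg for n in ast.walk(tree) if isinstance(n, ast.arg)}
    return ast.unparse(IdentifierScrambler(bound).visit(tree))


def run(source, *args):
    """Execute the source and call its total_price function."""
    namespace = {}
    exec(source, namespace)
    return namespace["total_price"](*args)


original = '''
def total_price(unit_price, quantity):
    subtotal = unit_price * quantity
    discount = subtotal // 10 if quantity > 5 else 0
    return subtotal - discount
'''
obfuscated = scramble(original)
```

Running both versions on the same inputs and comparing the results is a simple, partial check of the "same observable behaviour" condition; it does not prove equivalence.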



Other Attacks. Of course, this method cannot totally prevent the decompilation
of the code. The code has a meaning, and obfuscation must preserve it.
Therefore, with enough time and resources, it will always be possible to recover
the code.

5 FILIGRANE Step by Step 

In this section, we describe how everything works together. We do not go into
protocol and implementation details; we only give a basic description
of the important points.



5.1 User Registration and Smart Card Delivery 

The user must register in the FILIGRANE system to be able to use it. Once 
registered, he gets a personal smart card with a private key stored inside. This 
smart card will enable him to use the system. 

A privacy issue could be raised here, but this is not really a concern. The
registration of the user can be anonymous. The only technical requirement is
that the user gets a smart card. It is not mandatory that the physical user be
linked with his smart card. However, a link can be made between all purchases
made with the same smart card.

5.2 Contract Negotiation 

Before downloading the code, the end user and the provider must establish a 
contract where all details are written. This contract negotiation can take various 
forms. The important thing is that the contract is signed by the e-notary and 
that both (end user and provider) get back a signed version of the contract. 

To guarantee the security of the transaction, we can use a secure connection
between the provider and the end user. This secure channel is optional but
recommended. For simplicity, we chose to rely on an SSL (Secure Sockets Layer)
channel. To avoid duplication of key pairs, the end user key stored in the smart
card is used to establish the secure SSL link.

One of the easiest ways to implement the negotiation is to use web forms. A web
form with various options appears in the end user's browser. The end user
chooses the options he would like to have and the price is modified accordingly.
Once the end user is satisfied with the options, he sends the web form back. The
provider transforms the data into a language describing the options. The server
that produces the rules linked to the options is called the ERMS (Electronic
Right Management System). The ERMS writes those rules in a subset of XML
called SRDL (Software Right Description Language). With SRDL, we are able to
specify the chosen options formally. The provider signs the SRDL page and sends
it to the end user. With his smart card, the end user also signs the contract
and sends the signed version back to the provider. The provider transmits the
signed contract to the e-notary. The e-notary adds its own signature and a
timestamp. This last version of the contract is sent back to the provider and
will be attached to the mobile code package (explained in section 5.3).
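The order of the three signatures can be sketched as follows. This is a simplified illustration under loud assumptions: HMAC-SHA256 with shared secrets stands in for the public-key signatures of the real protocol (in FILIGRANE the end user signs with the private key held in the smart card), the rules are a plain dictionary rather than an SRDL page, and all keys, field names and the timestamp are hypothetical.

```python
import hashlib
import hmac
import json


def sign(secret, payload):
    """Stand-in for a digital signature (HMAC-SHA256, NOT a public-key signature)."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def canonical(document):
    """Deterministic byte encoding of the contract content."""
    return json.dumps(document, sort_keys=True).encode("utf-8")


def countersign(contract, party, secret, **extra):
    """Return a copy of the contract with party's signature over all prior content."""
    signed = dict(contract, **extra)
    signed[party + "_sig"] = sign(secret, canonical(signed))
    return signed


def verify(contract, party, secret):
    """Check party's signature against the content that preceded it."""
    keys = list(contract)
    prior = {k: contract[k] for k in keys[:keys.index(party + "_sig")]}
    return hmac.compare_digest(contract[party + "_sig"], sign(secret, canonical(prior)))


# Hypothetical keys and terms; none of these values come from the paper.
PROVIDER_KEY, USER_KEY, NOTARY_KEY = b"provider", b"user-card", b"e-notary"
draft = {"rules": {"max_uses": 10, "features": ["basic"]}}

by_provider = countersign(draft, "provider", PROVIDER_KEY)
by_user = countersign(by_provider, "user", USER_KEY)
final = countersign(by_user, "notary", NOTARY_KEY, timestamp="2000-01-01T00:00:00Z")
```

Because each party signs everything that came before it, changing the rules after the fact invalidates the provider's signature and every later one.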

As we already said, the payment mechanism is not included in this protocol. 
However it can be fairly easily added at the end of the current protocol and with 
the help of the right clearing house. 



5.3 Mobile Code Packaging 

Once the contract is signed, the mobile code must be packaged according to the
terms of the contract. The packaging also encompasses the addition of the
security mechanisms that were explained in section 4.

The watermarking is the first step of the packaging. Depending on the pur- 
poses, it can be done once and for all to prove ownership or it can be done 
dynamically with user dependent data to trace traitors. 

After the watermarking step, the mobile code is obfuscated. There is a complex
link between watermarking and obfuscation. Obfuscation destroys a lot of
meaningful information and is therefore likely to alter the watermark. In fact,
obfuscation is a very good attack against software watermarks. If the watermark
survives obfuscation (which must be the case), then it will be harder to remove.

The obfuscated and watermarked mobile code is then encrypted with a sym- 
metric algorithm for performance. The encryption key is user-dependent. 

Besides the code itself, other data are also packaged. The signed contract and
the rules, expressed in SRDL and stored in clear text, are also put in the
package.

A final label is added. This label contains useful information like code name, 
provider name, date, version, . . . 

The package containing the mobile code (obfuscated, watermarked and en- 
crypted), the contract, the rules and the label is signed by the provider. This 
will guarantee the origin and the integrity of the package. 

A smaller package for the smart card must also be made. This package contains
the decryption key for the mobile code and initialization values for the
rules (number of uses, duration limit, ...).

5.4 Software Downloading

The downloading must be divided into two different operations. The first one is 
the classical software download within the web browser. This classical step can 
be done within a secure connection but this is not required. Remember that the 
software is encrypted and that the encryption key is not present in the package. 

The second and most important part is the download of critical data inside 
the smart card. Those critical data (rules and decryption key) must be directly 
downloaded within the smart card. 

Usually this step is done by first downloading the data to the computer and
then sending them to the smart card. For normal data, this is not a
problem, but here we have a security issue. If the end user can eavesdrop on the
data and store them, he can resend them once the limits have expired. Moreover,
he can even retrieve the encryption key. Therefore this is not an option for us.

The solution is to avoid any trust in the computer. A secure link must be 
established between the smart card and the provider. This secure link must be 
end-to-end encrypted and the computer of the end user only plays the role of 
a proxy for the encrypted data stream. Our objective is that the computer
cannot access the decrypted stream.

As we use Java cards, we can use some existing mechanisms in Java to build
such a scheme. We built an RMI (Remote Method Invocation) proxy inside the
computer that only relays an encrypted RMI channel between the provider and
the smart card. With this solution, the provider can interact with the smart
card without any data leakage in the end user computer.



5.5 Secure Execution 

All previous steps are almost useless if execution of the mobile code is not con- 
trolled. An important part is therefore the secure execution of the mobile code. 
We have two major requirements here: 

— We want to avoid unauthorized access to code. As we took a lot of precautions 
to avoid illegal copy, illegal reverse engineering, ... up to now, we need to 
extend those protections when the mobile code is executed. The challenge 
here is the encryption because the encryption is the only protection that 
must be undone to be able to execute the code. 

— We want to avoid unauthorized transgression of the rules associated with the 
code. This means that the code can only be executed within our environment 
and that nobody can bypass the security controls we will add. 

Both are linked at some point because if you are able to decrypt the mobile 
code, it will be easier to execute it outside our secure environment. 

We customized a JRE (Java Runtime Environment) by adding the FILIGRANE
security engine. We will not describe all parts of the engine in
detail but rather explain how it works. A schematic view is shown
in Fig. 2.





Fig. 2. Secure execution



The FILIGRANE JRE is started. When normal (unprotected) code is
encountered, the JRE processes it as standard code without any intervention of
the security engine. When FILIGRANE-protected mobile code is launched, the
security engine takes control. Note that when a condition is not met and the
resulting action is not mentioned in the rest of the section, the
mobile code execution is stopped.

— First step. The security engine checks for the presence of a FILIGRANE
smart card. If the card is inserted, it verifies the presence of rules and keys
linked to the mobile code to be executed. Some of the rules linked with
the mobile code are checked before the execution of the mobile code. For
example, if the duration limit has expired, there is no reason to start the
mobile code.

— Second step. The rules manager (the ERMS client) is started and it receives 
all rules linked with the code. The ERMS client will run in a parallel thread 
and will dynamically check all rules. This ERMS client will be stopped once 
the mobile code ends. If one of the conditions is met, a signal is sent to 
the security engine that will take the appropriate action (stop, freeze, send 
message, . . . ). 

— Third step. The mobile code itself is launched. The code is loaded from 
the disk (in Java, class by class). The decryption key is extracted from the 
smart card and the mobile code is decrypted and then executed. As already 
mentioned in section 4.1, this is the weakest link because the decryption key 
will be temporarily in clear inside the memory. 
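The three steps above can be sketched as follows. This is a simplified model, not the FILIGRANE engine: the smart card is a plain dictionary, the rule names (`uses_left`, `max_seconds`) and the returned status strings are hypothetical, decryption is omitted, and the mobile code is assumed to cooperate by watching the stop signal, whereas the real security engine controls execution itself.

```python
import threading
import time


class RulesManager(threading.Thread):
    """Stand-in for the ERMS client: checks the rules in a parallel thread."""

    def __init__(self, rules, stop_signal, poll_interval=0.01):
        super().__init__(daemon=True)
        self.rules = rules
        self.stop_signal = stop_signal      # set when a limit is reached
        self.finished = threading.Event()   # set when the mobile code ends
        self.poll_interval = poll_interval

    def run(self):
        started = time.monotonic()
        while not self.finished.wait(self.poll_interval):
            if time.monotonic() - started >= self.rules["max_seconds"]:
                self.stop_signal.set()      # signal sent to the security engine
                return


def run_protected(card, code_id, mobile_code):
    # First step: a card holding rules and a key for this code must be present,
    # and the rules that can be checked before launch must still allow a run.
    entry = card.get(code_id) if card is not None else None
    if entry is None or entry["rules"]["uses_left"] <= 0:
        return "refused"
    entry["rules"]["uses_left"] -= 1
    # Second step: start the rules manager in a parallel thread.
    stop_signal = threading.Event()
    manager = RulesManager(entry["rules"], stop_signal)
    manager.start()
    # Third step: launch the code (decryption with entry["key"] is omitted here).
    try:
        mobile_code(stop_signal)
    finally:
        manager.finished.set()              # the rules manager stops with the code
        manager.join()
    return "stopped by rules" if stop_signal.is_set() else "completed"
```

For example, with `card = {"demo": {"key": b"k", "rules": {"uses_left": 1, "max_seconds": 0.05}}}`, calling `run_protected(card, "demo", lambda stop: stop.wait(2.0))` is interrupted by the rules manager, and a second call is refused because the use counter is exhausted.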
One straightforward attack that can be imagined against the FILIGRANE
security engine is to hack the security engine itself in order to disable the
rules manager or to recover the key. This is a really important issue.

Although the security engine is not tamper resistant, we are not far off.
Because we are working inside a JVM, we benefit from the existing security
infrastructure of the JVM. An insider attack (by which we mean an attack from a
Java application) is almost impossible to mount: two applications cannot
interfere in Java because of the sandbox it imposes.

An external attacker would be a little more successful, but the attack remains
complex. He can try to modify the classes of the security engine. This will be
difficult because those classes are signed and the signature is verified by the
JVM. To be effective, he also needs to hack the JVM itself to bypass the
signature verification. Another method is to try to access the smart card
directly. Although the security engine authenticates itself to the smart card,
it should not be impossible to simulate this authentication. If successful, this
only enables the recovery of the keys, because the security engine itself cannot
raise the value of the counters; only the provider can do that. One last method,
the most effective, is to dump the memory to get the decryption key.



6 Implementation 

When we designed this security infrastructure, we focused on a practical security 
infrastructure for mobile code commerce and not on a theoretical one that would 
be nice on paper but unusable in practice. Therefore, we were able to build a 
quite complete working demonstrator of our infrastructure. 

The demonstrator consists of a server that merges two roles (the producer and
the provider) and sells Java applications through a web server, and an end user
computer with a smart card. The contract negotiation, the mobile code packaging,
the software downloading and the secure execution described in the previous
sections are implemented as described, except for the e-notary and the
watermarking.



7 Conclusion 

As we have seen, e-commerce of mobile code is not easy to protect because of
the various issues to be addressed. Many techniques can be used for that
goal. However, if we want a usable system that is reasonably user-friendly, we
cannot guarantee 100% security. With these basic security blocks, we built a
working framework that does not lose too much security.

Abstract. The Logical Key Hierarchy (LKH) scheme and its derivatives
are among the most efficient protocols for multicast key management.
Traditionally, the key distribution tree in an LKH-based protocol is or- 
ganized as a balanced binary tree, which gives a uniform O(logn) com- 
plexity for compromise recovery for an n-member group. In this paper, 
we study improving the performance of LKH-based key distribution pro- 
tocols by organizing the LKH tree with respect to the members’ rekeying 
probabilities instead of keeping a uniform balanced tree. We propose two 
algorithms which combine ideas from data compression with the special 
requirements of multicast key management. Simulation results show that 
these algorithms can reduce the cost of multicast key management signi- 
ficantly, depending on the variation of rekey characteristics among group 
members. 



1 Introduction 

One of the biggest challenges in multicast security is to maintain a group key 
that is shared by all the group members and nobody else. The group key is used 
to provide secrecy and integrity protection for the group communication. The 
challenge of maintaining such a group key becomes greater when the groups are 
large and highly dynamic in terms of membership. 

Currently, the most efficient methods for multicast key management are ba- 
sed on the Logical Key Hierarchy (LKH) scheme of Wallner et al. [18] (also inde- 
pendently discovered by Wong et al. [19]). In LKH, group members are organized 
as leaves of a tree with logical internal nodes. The cost of a compromise recovery 
operation in LKH is proportional to the depth of the compromised member in 
the LKH tree. The original LKH scheme proposes maintaining a balanced tree,
which gives a uniform cost of O(log n) rekeys for compromise recovery in an
n-member group.

* This research was supported in part by the Department of Defense at the Maryland
Center for Telecommunications Research, University of Maryland Baltimore County.
The views and conclusions contained in this document are those of the authors and
should not be interpreted as representing the official policies, either expressed or
implied, of the Department of Defense or the U.S. Government.

A. A. Selçuk and D. Sidhu

J. Pieprzyk, E. Okamoto, and J. Seberry (Eds.): ISW 2000, LNCS 1975, pp. 179-193, 2000.
(c) Springer-Verlag Berlin Heidelberg 2000

In this paper, we study improving the performance of LKH-based key dis- 
tribution protocols by organizing the LKH tree with respect to the members’ 
rekeying probabilities instead of keeping a uniform balanced tree. This problem 
was first pointed out by Poovendran and Baras in Crypto’99 [15]; but no solu- 
tions have been proposed to date. We propose two algorithms which combine 
ideas from data compression with the special requirements of multicast key ma- 
nagement. Simulation results show that these algorithms can reduce the cost 
of multicast key management significantly, depending on the variation of rekey 
characteristics among group members. 

We start with an analysis of the probabilistic LKH optimization problem
(Sections 2, 3). Then we describe two algorithms for this problem and discuss 
their design rationale (Sections 4, 5, 6). We summarize the simulation results in 
Section 7. In Section 8, we conclude with a discussion of the issues regarding an 
effective utilization of these algorithms. 

1.1 Related Results 

Group key establishment protocols can be classified as (contributory) group key 
agreement protocols and (centralized) group key distribution protocols. Most 
group key agreement protocols are multi-party generalizations of the two-party 
Diffie-Hellman key agreement protocol [5,17,16]. They have the advantage of
doing without an active key management authority; but they also require quite 
intensive computation power, proportional to the size of the group. Therefore, 
group key agreement protocols are mostly used for relatively smaller groups 
(i.e. with 100 members or less). 

In Internet multicasting, groups are typically large, and there is an active 
group manager available. Therefore, most multicast key management protocols 
are based on centralized key distribution protocols. In the Group Key Mana- 
gement Protocol (GKMP) of Harney et al. [9], each group member obtains the 
group key by a unicast communication with the group key manager. This proto- 
col has the disadvantage of having to re-initialize the whole group when a mem- 
ber is compromised (possibly due to a departure). A similar but more scalable 
protocol is the Scalable Multicast Key Distribution (SMKD) protocol proposed 
by Ballardie [3]. In this protocol, the key manager delegates the key manage- 
ment authority to routers in the Core-Based Tree (CBT) multicast routing. The 
protocol has the disadvantage of requiring trusted routers and being specific 
to the CBT routing protocol. The Iolus protocol [14] deals with the scalability
problem by dividing the multicast group into subgroups. Each subgroup has 
its own subgroup key and key manager, and rekeying problems are localized to 
the subgroups. The multicast group is organized as a tree of these subgroups, 
and translators between neighbor subgroups help a multicast message propagate 
through the tree. A similar approach is a group key management framework
proposed by Hardjono et al. [8], where the group members are divided into “leaf
regions” and the managers of leaf regions are organized in a “trunk region”. The
key management problem is localized to the regions, and inter-region
communication is maintained by “key translators”. This framework provides a
scalable solution for key management in large multicast groups.

Currently, the most efficient multicast key distribution protocols which enable 
all group members to share a common key not known to anyone outside the group 
are based on the Logical Key Hierarchy (LKH) protocol and its variants. LKH- 
based protocols, as will be discussed in more detail in Section 2, have the ability 
to rekey the whole group with O(logn) multicast messages when a member is 
compromised. The LKH structure was independently discovered by Wallner et 
al. [18] and Wong et al. [19]. Modifications to the basic scheme which improve 
the message complexity by a factor of two with a relatively small computational 
overhead have been proposed in [13,6]. Certain similarities between LKH trees 
and some information-theoretic concepts have been pointed out in [15]. 

Another class of group key distribution protocols is the Broadcast 
Encryption protocols [7]. These protocols guarantee the secrecy of the key against 
coalitions of up to a specified number of outsiders. Luby and Staddon [12] prove 
a lower bound for the storage and transmission costs for the protocols in this 
class, which is prohibitively large for most cases. 



1.2 Notation 



The following notation is used throughout this paper: 

$n$    number of members in the group 
$M_i$  $i$th member of the group 
$d_i$  depth of $M_i$ in the LKH tree 
$p_i$  probability of $M_i$ being the next member to cause 
       a rekey (due to a departure, compromise, etc.) 

All logarithms are to the base 2, and $i$ in summations ranges from 1 to $n$, 
unless otherwise stated. 



2 The LKH Scheme 

The LKH scheme organizes the members of a multicast group as leaves of a key 
distribution tree where the internal (non-leaf) nodes are “logical” entities which 
do not correspond to any real-life entities in the multicast group and are only 
used for key distribution purposes. There is a key associated with each node 
in the tree, and each member holds a copy of every key on the path from its 
corresponding leaf node to the root of the tree. Hence, the key corresponding to 
the root node is shared by all members and serves as the group key. An instance 
of an LKH tree is shown in Figure 1. 

Fig. 1. An example LKH tree with eight members. Each member holds the keys on 
the path from its leaf node to the root. $K_{Root}$ is the key shared by all group members. 

In this figure, member $M_1$ holds a copy of the keys $K_{000}$, $K_{00}$, $K_0$, and $K_{Root}$; 
member $M_2$ holds a copy of $K_{001}$, $K_{00}$, $K_0$, and $K_{Root}$; and so on. In case of a 
compromise, the compromised keys are changed, and the new keys are multicast 
to the group encrypted by their children keys. For example, assume the keys 
of $M_2$ are compromised. First $K_{001}$ is changed and sent to $M_2$ over a secure 
unicast channel. Then $K_{00}$ is changed; two copies of the new key are encrypted 
by $K_{000}$ and $K_{001}$ and sent to the group. Then $K_0$ is changed and sent to the 
group, encrypted by $K_{00}$ and $K_{01}$; and finally $K_{Root}$ is changed and sent to the 
group, encrypted by $K_0$ and $K_1$. From each encrypted message, the new keys 
are extracted by the group members who have a valid copy of either one of the 
(child) encryption keys. 
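The compromise recovery just described can be traced with a small sketch. It assumes that nodes are labelled by bit strings as in Fig. 1 (so the leaf of the second member is `001`); the helper names `path_keys` and `rekey_messages` are hypothetical, and no actual encryption is performed, only the bookkeeping of which new key is sent under which child key.

```python
def path_keys(leaf):
    """Key labels held by the member at `leaf` (a bit string), leaf key first, root key last."""
    return [leaf[:k] for k in range(len(leaf), 0, -1)] + ["Root"]


def rekey_messages(leaf):
    """(new key, encrypting key) label pairs multicast when the member at `leaf` is compromised.

    The new leaf key travels over a secure unicast channel and is not listed here.
    """
    messages = []
    for new_key in path_keys(leaf)[1:]:           # every key above the leaf, bottom-up
        prefix = "" if new_key == "Root" else new_key
        for bit in "01":                          # one encrypted copy per child key
            messages.append((new_key, prefix + bit))
    return messages


m2_keys = path_keys("001")                        # the second member sits at leaf 001
m2_rekey = rekey_messages("001")
print(m2_keys)
print(m2_rekey)
```

In this sketch a member at depth d triggers 2d encrypted copies, which is one way to see why the rekey cost grows with the depth of the compromised member.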

If the security policy requires backward and forward secrecy for group com- 
munication (i.e. a new member should not be able to decrypt the communication 
that took place before its joining, and a former member should not be able to 
decrypt the communication that takes place after its leaving) then the keys on 
the leaving/joining member’s path in the tree should be changed in a way similar 
to that described above for compromise recovery. 

Although an LKH tree can be of arbitrary degree, the most efficient and 
practical protocols are obtained with binary trees, and studies in the field have 
mostly concentrated on binary trees [18,13,6]. We follow this convention and 
assume the LKH trees are binary. We also assume that the binary tree is always 
kept full (i.e. after deletion of a node, any node left with a single child is also 
removed). 

3 Probabilistic LKH Optimization 

The problem addressed in this paper is how to minimize the average rekey cost 
of an LKH-based protocol by organizing the LKH tree with respect to the rekey 
likelihoods of the members. Instead of keeping a uniform balanced tree, the 
average rekey cost can be reduced by decreasing the cost for more dynamic 
(i.e. more likely to rekey) members at the expense of increasing that cost for more 
stable members. This can be achieved by putting the more dynamic members 
closer to the root and moving more stable members further down the tree. 

The rekey operations caused by a periodic key update or a joining member 
can be realized by a single fixed-size multicast message using a one-way func- 
tion [13]. In this study, we will concentrate on the more costly rekey operations 
that are caused by a member compromise or eviction event. The communication 
and computation costs of these rekey operations are linearly proportional to the 
depth $d_i$ of the compromised (or evicted) member, as $a d_i + b$, $a > 0$. The exact 
values of $a$ and $b$ depend on the specifics of the LKH implementation. 

Finding the optimal solution to this problem that will minimize the average 
cost of all future rekey operations is not possible in practice since that would 
require the knowledge of rekey probability distributions for all current and pro- 
spective members of the group as well as the cost calculations for every possible 
sequence of future join, leave, and compromise events. Instead, we concentrate 
on a more tractable optimization problem, that is to minimize the cost of the 
next rekey operation. The expected cost of the next rekey operation, due to a 
leave or compromise event, is equal to 

$$\sum_i p_i d_i \qquad (1)$$ 

where $p_i$ is the probability that member $M_i$ will be the next to be evicted or 
compromised, and $d_i$ is its depth in the tree. This problem has many similarities to 
the data compression problem with code trees, where the average code length 
per message is $\sum_i p_i d_i$. This quantity is known as the average external 
path length of the tree, where $p_i$ is the probability of message $m_i$ to be the next 
to appear, and $d_i$ is its depth in the code tree. The optimal solution for the 
problem of minimizing the average external path length is given by Huffman 
trees [4]. Shannon-Fano trees are another alternative solution which give very 
good compression in practice but are slightly sub-optimal [4]. 
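To make the analogy with code trees concrete, the following sketch compares the average external path length of a balanced tree with that of a Huffman tree. The probabilities are made up for illustration, and `huffman_depths` and `expected_depth` are hypothetical helper names; the Huffman construction is the standard one (repeatedly merge the two least probable subtrees).

```python
import heapq
from itertools import count


def huffman_depths(probs):
    """Leaf depths of a Huffman tree built over `probs` (same order as the input)."""
    tie = count()                        # tie-breaker so the heap never compares lists
    heap = [(p, next(tie), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    depths = [0] * len(probs)
    while len(heap) > 1:
        p1, _, a = heapq.heappop(heap)
        p2, _, b = heapq.heappop(heap)
        for i in a + b:                  # every leaf under the merged node moves one level down
            depths[i] += 1
        heapq.heappush(heap, (p1 + p2, next(tie), a + b))
    return depths


def expected_depth(probs, depths):
    """Average external path length: the sum of p_i * d_i."""
    return sum(p * d for p, d in zip(probs, depths))


probs = [0.4, 0.2, 0.1, 0.1, 0.05, 0.05, 0.05, 0.05]    # illustrative rekey probabilities
balanced = [3] * 8                                        # balanced tree with eight leaves
huffman = huffman_depths(probs)
print(expected_depth(probs, balanced), expected_depth(probs, huffman))
```

With these illustrative numbers the skewed tree has a smaller expected depth than the balanced one, which is the effect the probabilistic organization tries to exploit.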



3.1 Differences from the Data Compression Problem 

In an LKH key distribution tree, a change in the tree structure, such as changing 
the location of an existing member in the tree, causes extra rekey operations, 
which adversely affects the objective function (i.e. minimizing the average num- 
ber of rekeys). On the other hand, in a data compression tree, a structural change 
does not directly induce an overhead on the objective function (i.e. minimizing 
the average code length). So, changes in the tree structure can be freely utilized 
in data compression algorithms, such as dynamic Huffman algorithms [10], to 
maintain the optimal tree structure, whereas they cannot be utilized so freely in 
dynamic LKH algorithms. Therefore, an LKH scheme with sub-optimal $\sum_i p_i d_i$ 
can have a better overall performance than one that keeps $\sum_i p_i d_i$ minimal all 
the time. 

Another difference between LKH trees and data compression trees is that, if mem- 
ber evictions are the main reason for rekey operations (i.e. if very few compromise 
events happen other than member evictions), then each member in the tree will 
cause a single rekey operation while it is in the tree. 

4 Design Rationale 

As discussed above, finding the optimal solution that minimizes the average num- 
ber of rekey messages over all future sequences of join, leave, and compromise 
events is not possible in practice. Therefore, we focus our attention on mini- 
mizing the expected cost of the next rekey event, $\sum_i p_i d_i$. The proven optimal 
solution for minimizing $\sum_i p_i d_i$ is given by a Huffman tree. However, maintaining 
a Huffman tree requires changes in the locations of the existing members 
in the tree, which means extra rekey operations. We choose to avoid this kind 
of extra rekey operations and concentrate on algorithms which do not require 
changing the location of the existing members. 

Given the condition that the locations of existing members will not be chan- 
ged in the tree, the main structural decision for the tree organization is where 
to put a new member at insertion time. Also, the insertion operation should ob- 
serve the current locations of existing members. That is, the keys each member 
is holding after an insertion operation should be the same as those it was hol- 
ding before the insertion (or the corresponding new keys, for the keys that are 
changed), plus possibly some newly added keys to the tree. Therefore, we will 
focus on insertion operations of the form illustrated in Figure 2, which preserve 
the relative location of present members. 



[Figure 2: the tree before and after Put(M, X).] 

Fig. 2. The Put procedure. The relative locations of existing nodes are kept the same to 
avoid extra rekey operations. 

That is, to insert a new member M into the group, a new internal node N is 
inserted at a certain location in the tree, and M is linked underneath. To denote 
this insertion operation at a given location X for a given new member M, we 
will write Put(M, X). Note that the traditional LKH insertion, where every new 
member is inserted as a sibling to a leaf node, is a specific case of Put(M, X) 
where X is a leaf node. 

In our probabilistic LKH trees, each node X in the tree has a probability field 
X.p that shows the cumulative probability of the members in the subtree rooted 
at X, similar to that in Huffman trees (i.e., X.p is equal to the probability of the 
corresponding member if X is a leaf node, and it is equal to X.left.p + X.right.p 
if X is an internal node). The Put procedure shown above also updates the p 
field of all nodes affected by the insertion as well as setting up the appropriate 
links for M and N. 
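A minimal sketch of the Put procedure is given below. The `Node` class and the function signature are assumptions made for illustration, as is the choice to make M the left child of the new internal node N; the text only requires that X keeps its relative position and that the p fields on the path to the root are updated.

```python
class Node:
    def __init__(self, p=0.0, member=None):
        self.p = p                    # cumulative probability (weight) of the subtree
        self.member = member          # member name for leaves, None for internal nodes
        self.parent = self.left = self.right = None


def put(root, m, x):
    """Insert leaf `m` at node `x` through a new internal node; return the (possibly new) root."""
    n = Node()
    parent = x.parent
    n.parent = parent
    if parent is None:
        root = n                      # x was the root, so n becomes the new root
    elif parent.left is x:
        parent.left = n
    else:
        parent.right = n
    n.left, n.right = m, x            # assumption: m goes to the left, x to the right
    m.parent = x.parent = n
    n.p = m.p + x.p
    while parent is not None:         # every ancestor gains the weight of the new member
        parent.p += m.p
        parent = parent.parent
    return root


a, b, c = Node(0.5, "A"), Node(0.3, "B"), Node(0.2, "C")
root = put(a, b, a)                   # two-member tree: new root above A
root = put(root, c, a)                # C becomes the sibling of A under a new internal node
print(root.p, a.parent.p)
```

Existing members never move relative to each other here: A only gains one new ancestor, so it keeps all the keys it held before and receives one additional key.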

5 Insertion Algorithms 

In this section, we describe two LKH insertion algorithms which seek to minimize 
the expected number of rekeys for the next member eviction or compromise 
event. The first algorithm does not induce any additional computational cost over 
the basic balanced-tree LKH insertion. The second algorithm provides further 
improvement over the first algorithm in message complexity but induces an $O(n)$ 
computational overhead for an $n$-member group. 

Algorithm 1: The first algorithm, Insert1, organizes the LKH tree in a 
way which imitates the Shannon-Fano data compression trees. In Shannon-Fano 
coding [4], a tree is constructed from a given set of probabilities by dividing the 
set into two parts of (roughly) equal probability repeatedly until every set includes 
a single element. Shannon-Fano coding guarantees a maximum redundancy 
of 1; i.e. $\sum_i p_i d_i < -\sum_i p_i \log p_i + 1$, for $\sum_i p_i = 1$. Even though finding the 
best partition is NP-hard, there are many partition heuristics that maintain the 
redundancy bound of 1. The fundamental principle of Insert1 is to insert a new 
node in a way which obtains the best partitioning at every level so that the resulting 
tree will have an average external path length close to the optimal bound 
of $-\sum_i p_i \log p_i$. The algorithm is described in Figure 3. To insert member M 
in a tree with root node R, the procedure is called as Insert1(M, R). 
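The redundancy bound can be checked numerically on a small example. The sketch below uses one common partition heuristic (sort by decreasing probability, then cut where the two parts are closest in total probability); this particular heuristic, the helper name `shannon_fano_depths` and the probabilities are assumptions for illustration, not taken from [4].

```python
from math import log2


def shannon_fano_depths(probs):
    """Code-tree depth of each symbol under a sorted, greedy two-way split."""
    depths = [0] * len(probs)

    def split(items):                 # items: indices sorted by decreasing probability
        if len(items) < 2:
            return
        total = sum(probs[i] for i in items)
        running, cut, best = 0.0, 1, None
        for k in range(1, len(items)):
            running += probs[items[k - 1]]
            gap = abs(total - 2 * running)     # imbalance if we cut before position k
            if best is None or gap < best:
                best, cut = gap, k
        for i in items:               # every symbol in a split group moves one level down
            depths[i] += 1
        split(items[:cut])
        split(items[cut:])

    split(sorted(range(len(probs)), key=lambda i: -probs[i]))
    return depths


probs = [0.4, 0.2, 0.1, 0.1, 0.05, 0.05, 0.05, 0.05]    # illustrative values
depths = shannon_fano_depths(probs)
avg_len = sum(p * d for p, d in zip(probs, depths))
entropy = -sum(p * log2(p) for p in probs)
print(depths, avg_len, entropy)
```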

Insert1(member M, node X): 
    if (M.p > X.left.p) and (M.p > X.right.p) 
        Put(M, X); 
    else if (X.left.p > X.right.p) 
        Insert1(M, X.right); 
    else 
        Insert1(M, X.left); 

Fig. 3. Algorithm Insert1. It tries to keep the subtree probabilities as balanced as 
possible at every level. 
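As a runnable sketch of Algorithm 1 (Fig. 3), the code below follows the comparisons of the figure. Two points are assumptions of this sketch rather than statements of the paper: a leaf X is treated as an insertion point (the figure does not spell out this base case), and `Node`, `put`, `root_of` and `depth` are hypothetical helpers. The weights are made up.

```python
class Node:
    def __init__(self, p, left=None, right=None):
        self.p, self.left, self.right, self.parent = p, left, right, None
        for child in (left, right):
            if child is not None:
                child.parent = self


def put(m, x):
    """Replace x by a new internal node with children m and x; update p on the way up."""
    parent = x.parent                 # remember the old parent before x is re-linked
    n = Node(m.p + x.p, m, x)
    n.parent = parent
    if parent is not None:
        if parent.left is x:
            parent.left = n
        else:
            parent.right = n
    while parent is not None:
        parent.p += m.p
        parent = parent.parent


def insert1(m, x):
    """Descend into the lighter subtree until m outweighs both children (or x is a leaf)."""
    if x.left is None or (m.p > x.left.p and m.p > x.right.p):
        put(m, x)
    elif x.left.p > x.right.p:
        insert1(m, x.right)
    else:
        insert1(m, x.left)


def root_of(node):
    while node.parent is not None:
        node = node.parent
    return node


def depth(node):
    d = 0
    while node.parent is not None:
        node, d = node.parent, d + 1
    return d


members = [Node(w) for w in (0.1, 0.1, 0.1, 0.7)]        # illustrative weights
root = members[0]
for m in members[1:]:
    insert1(m, root)
    root = root_of(root)
depths = [depth(m) for m in members]
print(depths)
```

In this run the heavy last member is attached directly under the root, while the three light members sit deeper in the tree.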

Algorithm 2: The second algorithm, Insert2, finds the best insertion point 
for member M by searching all possible insertion points in the tree. The amount 
of increase in the average external path length that will be caused by Put(M, X) 
at node X of depth d is equal to d·M.p + X.p. Insert2 searches the whole tree 
to find the location which minimizes this quantity. In Figure 4, d(X) denotes 
the depth of node X in tree T. 

Computational performance of Insert2 can be improved by taking shortcuts 
in finding Xmin. For example, when X.p < M.p the subtree under X need not be 
searched. More sophisticated shortcuts which improve the performance further 
are also possible. But in the worst case, Cost[X] should be computed for all 
nodes in the tree. Nevertheless, the formula for Cost[X] is quite simple and can 
be computed quite efficiently. So, when the computation power of the server is 
plentiful compared to the bandwidth, Insert2 can be the method of choice which 
obtains improved reduction in number of rekeys at the expense of computational 
cost. 

Insert2(member M, tree T): 
    Costmin ← ∞ 
    for each X ∈ T do 
        Cost[X] ← d(X)·M.p + X.p 
        if Cost[X] < Costmin 
            Xmin ← X 
            Costmin ← Cost[X] 
    Put(M, Xmin) 

Fig. 4. Algorithm Insert2. It searches the whole tree for the insertion location that 
would minimize the increase in the average external path length of the tree. 
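A runnable sketch of Algorithm 2 (Fig. 4) follows. It evaluates the cost d(X)·M.p + X.p for every node, without the shortcuts mentioned in the text; the `Node`, `put`, `nodes_with_depth` and `depth` helpers and the weights are assumptions made for illustration.

```python
class Node:
    def __init__(self, p, left=None, right=None):
        self.p, self.left, self.right, self.parent = p, left, right, None
        for child in (left, right):
            if child is not None:
                child.parent = self


def put(m, x):
    """Replace x by a new internal node with children m and x; update p on the way up."""
    parent = x.parent
    n = Node(m.p + x.p, m, x)
    n.parent = parent
    if parent is not None:
        if parent.left is x:
            parent.left = n
        else:
            parent.right = n
    while parent is not None:
        parent.p += m.p
        parent = parent.parent


def nodes_with_depth(root):
    """Yield (node, depth) for every node of the tree, internal nodes and leaves alike."""
    stack = [(root, 0)]
    while stack:
        x, d = stack.pop()
        yield x, d
        if x.left is not None:        # the tree is full: internal nodes have two children
            stack.extend([(x.left, d + 1), (x.right, d + 1)])


def insert2(m, root):
    """Exhaustive search for the node minimizing d(X) * M.p + X.p; returns the new root."""
    x_min, _ = min(nodes_with_depth(root), key=lambda xd: xd[1] * m.p + xd[0].p)
    put(m, x_min)
    node = m
    while node.parent is not None:
        node = node.parent
    return node


def depth(node):
    d = 0
    while node.parent is not None:
        node, d = node.parent, d + 1
    return d


members = [Node(w) for w in (0.1, 0.15, 0.2, 0.55)]      # illustrative weights
root = members[0]
for m in members[1:]:
    root = insert2(m, root)
depths = [depth(m) for m in members]
print(depths)
```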

6 Weights Other Than Probabilities 

To use the insertion algorithms as described above, it is crucial to know the 
$p_i$ values of all members in the tree at insertion time. This requirement is not 
practical since computing the $p_i$ values would require the knowledge of the rekey 
time probability functions for all members in the tree. Moreover, even if the rekey 
time probability functions are known for all members, the $p_i$ values will change 
continuously as members stay in the group (unless the probability functions are 
memoryless), which further hinders the usage of actual probability values for 
insertion. 

In this section, we discuss an alternative weight assignment technique to 
use with the insertion algorithms. First we note that the Insert1 and Insert2 
algorithms, as well as Huffman and Shannon-Fano coding, can dispense with the 
restriction that $\sum_i p_i = 1$ and can work with any non-negative weights $w_i$, as 
long as the relative proportions are kept the same. Corresponding $p_i$ values that 
satisfy $\sum_i p_i = 1$ can be calculated as $p_i = w_i / W$, where $W = \sum_i w_i$. 

The weight assignment of our choice for the insertion algorithms is the inverse 
of the mean inter-rekey time of members; i.e., 

$$w_i = 1/\mu_i \qquad (2)$$ 

where $\mu_i$ is the average time between two rekeys by member $M_i$. There are 
two reasons for our choice of $1/\mu_i$ as the weight measure among many other 
candidates: 

1. Its simplicity and convenience. 

2. In the special case where the members’ inter-rekey time distributions are 
exponential, $p_i = w_i / W$ gives exactly the probability that $M_i$ will be the 
next member to rekey. 
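The second reason follows from a standard property of exponential random variables. As a short derivation, assume the members' times to their next rekey, $T_j$, are independent (an assumption not stated explicitly above) and exponentially distributed with rates $\lambda_j = 1/\mu_j = w_j$. By memorylessness, the time already spent in the group does not matter, and

```latex
\Pr\bigl[T_i = \min_j T_j\bigr]
  = \int_0^\infty \lambda_i e^{-\lambda_i t} \prod_{j \ne i} e^{-\lambda_j t} \, dt
  = \lambda_i \int_0^\infty e^{-\left(\sum_j \lambda_j\right) t} \, dt
  = \frac{\lambda_i}{\sum_j \lambda_j}
  = \frac{w_i}{W}.
```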

Moreover, the estimation of the $\mu_i$ values can be done quite efficiently from 
the average of past rekey times of members, which should be maintained if a 
probabilistic LKH organization will be implemented. In fact, maintaining only 
the average value of past inter-rekey times along with their count is sufficient to 
estimate the mean inter-rekey time. More sophisticated estimates of $\mu_i$ can be 
obtained by further analysis of the members’ behavior. 
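The bookkeeping described above can be sketched as follows. The class name `RekeyStats`, the `normalize` helper and the observed intervals are hypothetical; the sketch keeps only a count and a running mean per member and assumes at least one observation before a weight is requested.

```python
class RekeyStats:
    """Running mean of a member's observed inter-rekey times (count and mean only)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def observe(self, interval):
        """Fold one more observed inter-rekey time into the running mean."""
        self.count += 1
        self.mean += (interval - self.mean) / self.count

    def weight(self):
        """w_i = 1 / mu_i as in Eq. (2); requires at least one observation."""
        return 1.0 / self.mean


def normalize(weights):
    """Turn non-negative weights into p_i = w_i / W."""
    total = sum(weights)
    return [w / total for w in weights]


stats_a, stats_b = RekeyStats(), RekeyStats()
for interval in (2.0, 4.0):           # made-up observations for one member
    stats_a.observe(interval)
stats_b.observe(6.0)                  # made-up observation for another member
p = normalize([stats_a.weight(), stats_b.weight()])
print(p)
```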





Probabilistic Methods in Multicast Key Management 187 

7 Simulation Experiments 

We tested the performance of Insert1 and Insert2 with a large number of com- 
puter simulations. The simulations are run in four different scenarios, classified 
with respect to a number of group characteristics. In one division, the multicast 
groups are classified as terminating vs. non-terminating groups. Terminating 
groups exist for a specified time period, whereas the lifetime of a non-terminating 
group is practically infinite. In another division, groups are classified as dynamic 
vs. semi-static groups. In dynamic groups, members join and leave the group 
continually and the main source of rekey operations is the leaving members. In 
semi-static groups, a joining member stays in the group, or keeps the access 
rights, till the end of the session. In this case, all rekeys are due to compromised 
members. Semi-static groups are typically terminating groups. Another factor 
in the classification of the simulations is the joining time of the group members. 
Members either join the group at a constant rate, according to an exponential 
inter-arrival time distribution, or they all join at the beginning of the session. 

There are two main sources of randomness in the simulations regarding the 
rekey times of group members: 

1. Variation among group members. Mean inter-rekey time, i.e. the average 
time period between two rekey events by a member, varies among group 
members. The mean inter-rekey time values, denoted by $\mu_i$ for member $M_i$, 
are distributed according to a source probability distribution function, $D_S$, 
with a mean value of $\mu_S$. 

2. Randomness within a member. The time of the next rekey event by each 
member is a random variable, distributed by a rekey probability distribution 
function, $D_R$, with mean $\mu_i$ for member $M_i$. 

So, when a new member $M_i$ joins a group in the simulations, first it is assigned 
a $\mu_i$ value from $D_S$, and then it generates the times of future rekey events 
according to $\mu_i$ and $D_R$. Regarding the variance of the distributions $D_S$ and $D_R$, 
a coefficient $c_\sigma$, called the variance factor, is used which denotes the standard 
deviation of a distribution in terms of its mean, i.e. $\sigma = c_\sigma \mu$. 
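The two sources of randomness can be sketched as a two-stage sampling procedure. The function names, the parameter values and the fixed seed are assumptions for illustration. Both distributions are taken to be normal here, and non-positive draws are simply redrawn, which is a choice of this sketch and not something the text specifies.

```python
import random


def sample_positive_normal(rng, mean, c_sigma):
    """Draw from a normal distribution with std = c_sigma * mean, redrawing non-positive values."""
    while True:
        x = rng.gauss(mean, c_sigma * mean)
        if x > 0:
            return x


def new_member(rng, mu_s, c_sigma_s, c_sigma_r, horizon):
    """Return (mu_i, rekey times up to `horizon`) for one simulated member."""
    mu_i = sample_positive_normal(rng, mu_s, c_sigma_s)      # variation among members
    times, t = [], 0.0
    while True:
        t += sample_positive_normal(rng, mu_i, c_sigma_r)    # randomness within a member
        if t > horizon:
            return mu_i, times
        times.append(t)


rng = random.Random(0)                # fixed seed so the sketch is repeatable
mu_i, times = new_member(rng, mu_s=10.0, c_sigma_s=0.5, c_sigma_r=0.5, horizon=100.0)
print(mu_i, len(times))
```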

The following list summarizes the notation used for the group parameters: 

$T$          lifetime of the session 
$\lambda_A$  arrival rate of new members 
$D_S$        source probability distribution function for $\mu_i$ values 
$\mu_S$      mean value for $D_S$ 

The following list summarizes the notation used for the rekey time of individual 
members: 

$t_i$    next rekey time for member $M_i$ 
$\mu_i$  mean inter-rekey time for member $M_i$ 
$D_R$    probability distribution function for the inter-rekey time of individual 
         members 

In the simulations, we used many different distribution functions and many 
different variance factors for $D_S$ and $D_R$. The tests showed that the form of $D_R$ 
(i.e. its being normal, uniform or exponential) and its variance have very little 
effect on the performance results. The tests also showed that the single most 
important factor for the performance of the probabilistic LKH algorithms is the 
variance factor of $D_S$ (i.e. variation among group members), quite independent 
of the form of the function for $D_S$. Unless otherwise stated, the presented 
simulations use the normal distribution for $D_S$ and $D_R$, with a fixed variance 
factor of 0.5 for $D_R$, which is a good representative of the average case. In the 
following tables, $c_\sigma$ is used exclusively to denote the variance factor of $D_S$. 

7.1 Simulation Results 

During each simulation run, three LKH trees are maintained for the multicast 
group, one for each insertion algorithm: Insert1, Insert2, and the basic balanced 
LKH insertion. The performance of each algorithm is measured as the number of 
keys updated by member compromise and eviction events. The tables present the 
number of key updates in the trees of Insert1 and Insert2 as a fraction of the 
key updates in the basic balanced LKH tree. The presented results are the 
averages obtained over one hundred randomly generated simulation runs. I1 and I2 
denote Insert1 and Insert2 respectively. 

Table 1. Test results for Scenario 1. The results show that the algorithms provide 
significant reductions in rekey costs when there is a significant variation among the 
rekey rates of the members (i.e. large values of $c_\sigma$). 

Scenario 1. The first scenario we consider is a terminating, semi-static group, 
where all members join the group at the beginning of the session. The important 
parameters for this scenario are the size of the group, $n$, the lifetime of the session, 
$T$, and the average inter-rekey time of the members, $\mu_S$. In fact, nominal values 
of $T$ and $\mu_S$ do not matter and the important parameter is their ratio $T/\mu_S$, 
which we denote by $c_T$. Roughly speaking, $c_T$ denotes the number of rekeys an 
average member would cause during the lifetime of the session. The results are 
summarized in Table 1. 

Table 2. Test results for Scenario 2. The results resemble those in Table 1. The 
algorithms provide significant improvements when there is a significant variation among 
the rekey rates of the members (i.e. large values of $c_\sigma$). 



Scenario 2. In the second scenario, we again consider a terminating, semi-static 
group. But this time new members keep joining at a constant rate till the end of 
the session. Important parameters for this case include the lifetime of the session 
in terms of the mean inter-arrival time, $T/(1/\lambda_A)$, and in terms of the average 
inter-rekey time of the members, $T/\mu_S$. For simplicity, we take the average 
inter-arrival time $1/\lambda_A$ as the unit time, so $T$ denotes $T/(1/\lambda_A)$. $T/\mu_S$ is denoted by 
$c_T$. The results are summarized in Table 2. 

Scenario 3. The third scenario we consider is a terminating, dynamic group. 
It is similar to Scenario 2 except that members may leave the group before 
the end of the session. All rekeys are due to leaving members and there are no 
additional compromise events. Hence, inter-rekey parameters such as $\mu_S$ 
should be interpreted as parameters for the member lifetime (i.e. time of stay in 
the group). The test parameters are similar to those in Scenario 2. The results 
are summarized in Table 3. 

Table 3. Test results for Scenario 3. The improvement rates are more significant for 
smaller values of $c_T$. The source of the difference from Scenario 2 is that, in this scenario, 
all rekeys are due to member departures, so each member causes at most one rekey 
event. Hence, when the session is significantly longer than the average member stay 
time (i.e. larger values of $c_T$), differences among expected member stay times become 
less important. 

Scenario 4. In the fourth scenario, we consider a long-term dynamic group. The 
session lifetime $T$ is practically infinite. Members join and leave the session at a 
certain rate. All rekey operations are due to departing members. The important 
parameter in this case is the average member lifetime in terms of the average 
inter-arrival time, $\mu_S/(1/\lambda_A)$. In the steady state [11], the departure rate is equal 
to the arrival rate, and hence, the group has an average of $n = \mu_S/(1/\lambda_A)$ members. 
The measurements are taken over 10,000 consecutive rekey operations in the 
steady state. Again, $1/\lambda_A$ is taken as the unit time. The results are summarized 
in Table 4. 

Table 4. Test results for Scenario 4. The improvement rates are quite modest compared 
to those for Scenarios 1, 2 and 3. 

7.2 Comments on Results 

The test results show that the algorithms for probabilistic LKH organization 
make the biggest difference when there is a significant variance among the rekey 
rates of the group members (i.e. large $c_\sigma$ in $D_S$) and also when there are com- 
promise events in addition to those caused by leaving members (e.g. Scenarios 1 
and 2). In these cases, the algorithms provide up to 40% reduction in the cost of 
rekey operations. Larger group sizes contribute positively to the reduction rates 
as well. 

When the main source of rekeys is leaving members, the algorithms provide 
significant gains if the average member inter-rekey time $\mu_S$ is close to the session 
time or longer (i.e. the smaller values of $c_T$ in the third scenario), because in 
this case most rekey events come from short-lived members allocated closer to 
the root of the LKH tree. When the session time is significantly longer than $\mu_S$ 
and the main source of rekeys is member evictions (Scenario 4, or larger values 
of $c_T$ in Scenario 3), members allocated deeper in the tree also contribute to 
the rekey events, and the improvement rates obtained by the algorithms drop to 
5% or less. 

We would like to note that in all these simulations it is assumed the over- 
all space of potential members is so large that the distribution of the joining 
members is unaffected by the leaving members (or, the current members) of the 
group, which is practically equivalent to the case where leaving members ne- 
ver return. So, our simulation scenarios do not represent the cases where a few 
very dynamic members can affect the rekey dynamics significantly by frequent 
join and leave operations. In such cases, the probabilistic insertion algorithms 
can provide very significant gains even if all rekey operations are due to leaving 
members. Such gains are not reflected in the results of simulation scenarios 3 
and 4, where all rekeys are due to leaving members, since the incoming member 
parameters in the simulations are independent of those who have left the group. 

Finally, it is interesting to note that the improvement figures obtained by 
Insert1, which does not induce any additional computational cost over the basic 
balanced-tree LKH insertion, are consistently very close to those obtained by 
Insert2, which searches the whole tree for the best insertion point. This, in our 
opinion, indicates the strength of the basic idea underlying Insert1, that is to 
keep the subtree probabilities as balanced as possible. 

8 Conclusions 

In this paper, two algorithms are described which can reduce the cost of mul- 
ticast key management significantly, depending on certain characteristics of the 
multicast group. The algorithms described here are not specific to the basic LKH 
scheme of Wallner et al. [18] but are also applicable to more sophisticated LKH- 
based techniques such as the OFT scheme of McGrew and Sherman [13] and the 
OFC scheme of Canetti et al. [6]. The algorithms can work with relatively small 
computational overhead (and no overhead in case of Insert1) and can provide 
significant reductions in the message complexity of rekey operations. The im- 
provements can be significant or modest, depending on certain characteristics 
of the group, as suggested by the simulations on different multicast scenarios in 
Section 7. 

The requirement of using actual probabilities for tree organization can be a 
major limitation for probabilistic LKH organization techniques. Instead of using 
actual probabilities, we suggest using a heuristic weight assignment technique for 
the tree organization, which is described in Section 6. Simulations summarized 
in Section 7, which were implemented with this weight assignment technique, 
show that this heuristic method works effectively in practice. 

The studies of Almeroth and Ammar [1,2] about the behavior of multicast 
group members in the MBone show that significant differences may exist among 
the members of a group. When significant differences exist among the group 
members and it is practical to maintain data regarding past behavior of the 
members, we believe the algorithms discussed in this paper can provide signifi- 
cant reductions in the cost of rekey operations in multicast key management. 

Acknowledgments. We would like to thank Eric Harder and Chris McCubbin 
for many helpful suggestions and informative discussions. 

Abstract. We propose a simple classification method for public-key based 
authentication protocols, which consists of identifying several basic properties 
leading to a large number of generic prototypes for authentication. Most 
published protocols can be identified as a concrete instance of one of the 
generic types. The classification method provides a means to clarify the 
similarities and differences between different concrete protocols. This 
facilitates avoidance of previous mistakes when designing a new protocol and 
allows re-use of analysis of a given abstract protocol when classifying any 
given concrete protocol. 



1 Introduction 

Authentication may be considered as one of the most critical elements of 
cryptographic security protocols; in some sense we can consider most cryptographic 
protocols as various extensions of the relevant authentication protocols. Most research 
in authentication protocols focuses either on generic methodologies of analysis (e.g., 
formal logic approaches [4]) or design of novel protocols for a particular set of 
requirements from application environments. In this paper we propose a simple, but 
nevertheless very useful, approach toward authentication protocol design and 
analysis: classification of authentication protocols. 

With so many authentication protocols published and analysed (and often having 
turned out to be weak) it may be hard to believe that we still do not have a well- 
established classification methodology for them. This is probably because most 
protocols are described in detailed mathematical notation. Secure authentication 
protocols rely upon sound integration of underlying cryptographic primitives and the 
series of message exchanges between protocol principals. Consequently, the analysis 
of any given authentication protocol will require examining three points: underlying 
algorithms used in the protocol; the procedure of message exchange; and the secure 
combination of these two ingredients. We only focus on the second point, namely 
message exchanges in authentication protocols. This allows us to abstract away from 
the details of the cryptographic mechanisms and concentrate instead on the 
fundamental protocol structure. 

2 Concepts, Definitions, and Notations 

The conventional ways to describe authentication protocols depend heavily on 
mathematical terms such as discrete exponentiations and hash functions. Even 
abstract level notations are sometimes not free from these mathematical languages. 
When trying to establish a method of classification, we have noticed that we need a 
new notation which is free from mathematical functions or very specific notions of 
security services such as digital signature. The declaration of freedom from a security 
service such as signature might sound rather radical. In practice many conventions for 
abstract description (such as { • } or $Sig_A${ • }) intend digital signatures using the 
private key as the concrete interpretation for the description. Our classification makes 
no such assumption. 

The inclusion of extraneous information when describing authentication protocols 
may sometimes turn out to be a stumbling block on the way to a clear perspective 
with regard to classification work. Classification implies a collection of general 
prototypes, each of which may cover many concrete instance protocols. In this regard, 
classification work seems to naturally require us to identify the essential elements in 
authentication protocols. These elements behave as cryptographic particles, and the 
way that they are combined and used seems to be a good criterion for classification of 
authentication protocols. 

The ultimate goal of entity authentication in its cryptographic context is to check if 
the identity claimant or prover has the relevant private information, namely the 
private key of the principal whose identity has been claimed by the prover. In other 
words, the relevant private key should be used in the current session of an 
authentication protocol and the protocol must provide the verifier a clear indication of 
the application of the private key to particular data. This particular data, of course, 
must have a freshness property to prevent any replay attack. Another important 
property which is desirable in any entity authentication protocol is that the verifier 
needs to be sure about whom the prover is claiming an identity to. 

From this simple observation, we can define cryptographic particles as follows. 

Definition 1. Cryptographic Particles of an authentication protocol include the 
following two types, each of which is subdivided into two sub-types: 

• identities: 
  - A, B 
• key values: 
  - APriKey: the private key of the principal A whose identity is claimed to the 
    verifier 
  - APubKey: the public key of the principal A whose identity is claimed to the 
    verifier 
• fresh data: 
  - forced challenge 
  - self challenge 

The two different types of fresh challenge data need to be more clearly described in 
the following definition. 

Definition 2. Forced Challenge (F), Self Challenge (S) and No Challenge (∅): If the
fresh data is a random nonce generated by the verifier and then delivered as a 
plaintext or a ciphertext to the prover, then we say that the protocol uses a forced 
challenge to authenticate the prover. On the other hand, if the fresh data is generated 
by the prover himself the protocol is said to use self challenge. When there is no 
challenge value exchanged in the protocol, we say that the protocol has no challenge. 

Self challenge values may include sequence parameters and time stamps (both of
these types will be denoted as TS_A or TS_B in this paper) whose timeliness can be
checked by the verifier. 
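
The timeliness check that a verifier applies to a self challenge can be sketched as follows. This is only an illustration: the class name `FreshnessChecker`, the 30-second acceptance window and the per-prover sequence table are assumptions made for the example, not part of the classification itself.

```python
from dataclasses import dataclass, field


@dataclass
class FreshnessChecker:
    """Verifier-side timeliness check for self challenges (hypothetical helper)."""
    max_skew: float = 30.0                      # assumed acceptance window in seconds
    last_seq: dict = field(default_factory=dict)  # highest sequence number seen per prover

    def timestamp_is_fresh(self, ts: float, now: float) -> bool:
        # A time stamp is accepted only if it lies within the window around "now".
        return abs(now - ts) <= self.max_skew

    def sequence_is_fresh(self, prover: str, seq: int) -> bool:
        # A sequence parameter is accepted only if it is strictly increasing.
        if seq <= self.last_seq.get(prover, -1):
            return False
        self.last_seq[prover] = seq
        return True
```

A forced challenge needs no such state or clock on the verifier's side beyond remembering the nonce it issued, which is one practical difference between the F and S types.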

Now, taking a look at how the key values may be used in authentication protocols, 
we can observe two different patterns as follows. 

• Application of APubKey by the verifier and then APriKey by the prover 

• Application of APriKey by the prover and then APubKey by the verifier 

Each of these two patterns has to be woven together with fresh data. Hence the 
above two patterns can be developed into more detail as follows. 

• Application of APubKey by the verifier and then APriKey by the prover:
The following example uses a forced challenge, that is, a random nonce chosen by
the verifier B.

- 1. A ← B: APubKey{B, r_B}
  2. A → B: r_B

• Application of APriKey by the prover and then APubKey by the verifier:
The first of the following examples uses a forced challenge, whereas the second
uses a self challenge.

- 1. A ← B: r_B
  2. A → B: APriKey{B, r_B}

- 1. A → B: APriKey{B, TS_A}

Here APriKey{B, r_B} does not necessarily imply a signature transformation but
simply an application of APriKey to the other cryptographic particles B and r_B. It can
be implemented by {B, r_B}_{K_A^{-1}}, {h(B, r_B)}_{K_A^{-1}} or some other way that
lets the verifier B validate the authenticity of the result. Likewise, APriKey{B, TS_A}
may correspond to a more concrete form such as “TS_A, {h(B, TS_A)}_{K_A^{-1}}”. On the
other hand, APubKey{B, r_B} roughly corresponds to any encryption using APubKey from
which only A can retrieve r_B. In essence, we focus only on authentication here, not on
any other property such as signature or key establishment.

We have so far obtained three different elementary types of authentication
protocols, but the classification is still far from complete. Sometimes, for instance, we
see a protocol where there is no explicit response message to its corresponding challenge.
Before we proceed to the more comprehensive classification steps, we introduce some
terminology to keep the description of those steps concise.

Definition 3. Origin Authentication (OA) and Destination Authentication (DA): If a
protocol contains a message of the form APriKey{•} then we say the protocol
provides origin authentication of the entity A, whereas if it contains a message of the
form APubKey{•} then it provides destination authentication of the entity A.

Definition 4. Implicit Authentication (IA): If a protocol contains no message of the
form APriKey{•} or APubKey{•}, but still requires entity A to compute a value of the
form APriKey{•}, then we say that the protocol provides implicit authentication of A.

One thing to note regarding our new term implicit authentication is that there is a 
similar term implicit key authentication which is a feature of some key establishment 
protocols [11]. These two different usages of the same term are rather orthogonal to 
each other. Furthermore, although there seems to be no explicit definition of implicit 
(entity) authentication in the literature, some authors use the notion implicit in the 
context of entity authentication as well. The following excerpt shows this point [8]. 

B is “explicitly” authenticated based on B’s signature transformation. A is
“implicitly” authenticated based on the use of its public encipherment 
transformation, since only A is able to decipher the key token. 

Hence the existing concept of implicit entity authentication corresponds roughly to 
our destination authentication, and explicit entity authentication to our origin 
authentication. On the other hand, with our definition of implicit authentication, there 
seems to be no corresponding terminology in the literature. The exact meaning of
these terms in our context will become clearer when we deal with the final 
classification table in a later section. 



3 Classification Steps 

We now have all the tools we need to carry through the sequential steps of
classification. Each of the following steps is related to the types of cryptographic
particles and/or the way they are combined and used in authentication protocols.

Step I: Identify the type of authentication adopted in the given protocol; is it implicit
authentication (IA), origin authentication (OA) or destination authentication (DA)?

Step II: Identify the type of challenge values used in the given protocol; is it forced
challenge (F), self challenge (S) or no challenge (∅)?

At the end of this step, the protocol is identified as one of the seven types: IA_∅, IA_F,
OA_∅, OA_S, OA_F, DA_∅, DA_F.

Step III: In the case of the type DA_F, namely DA type of authentication with forced
challenge, is there a subsequent response from the prover to the challenge? According
to the answer to this question, the protocol is assigned to one of two types:
DA_F,NoAck and DA_F,Ack. The former stands for the answer “No” and the latter for “Yes”.

Now, we have eight different types of authentication in total, which are summarized
and described in detail in Tables 1 and 2.



Table 1. Classification steps of all the possible types of authentication

Step I     Step II     Step III

IA         IA_∅
           IA_F

OA         OA_∅
           OA_S
           OA_F

DA         DA_∅
           DA_F        DA_F,NoAck
                       DA_F,Ack

Table 2. Eight different prototypes of authentication and the corresponding examples

Authentication type                  Prototype      Example

IA (Implicit Authentication)         IA_∅           A : APriKey{B}

                                     IA_F           A ← B : r_B
                                                    A : APriKey{B, r_B}

OA (Origin Authentication)           OA_∅           A → B : APriKey{B}

                                     OA_S           A → B : TS_A, APriKey{B, TS_A}

                                     OA_F           A ← B : r_B
                                                    A → B : APriKey{B, r_B}

DA (Destination Authentication)      DA_∅           A ← B : APubKey{B}

                                     DA_F,NoAck     A ← B : APubKey{B, r_B}

                                     DA_F,Ack       A ← B : APubKey{B, r_B}
                                                    A → B : r_B



In Table 2, the principal B is the verifier authenticating the claimed identity A, and
an expression “A: APriKey{B, r_B}” means that the principal A computes APriKey{B,
r_B}, whereas “A → B: APriKey{B, r_B}” means that A computes APriKey{B, r_B}
and then sends it to B. Also note that all the example protocols include only
cryptographic particles: identities, keys, forced challenges and/or self challenges.

The example protocols for the prototypes are not intended to be perfect in terms of
security. In fact, some modifications will have to be made to these example
protocols to make them flawless. Furthermore, the prototypes without any challenge at
all, namely IA_∅, OA_∅ and DA_∅, can never be made secure against replay attacks
because of their inherent lack of freshness. Also note that the classification table does
not include the type IA_S, namely implicit authentication with self challenge, simply
because it seems questionable to consider timeliness of any self challenge value or
time variant parameter which is not delivered at all in the current session of a
particular authentication protocol.
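
Steps I to III can be read as a small decision procedure. The following sketch, given only as an illustration, encodes them in Python; the function names and the string labels (with "0" standing for "no challenge") are assumptions of the example. Rejecting destination authentication with a self challenge follows from that combination not appearing among the seven types listed after Step II.

```python
def classify(auth_type: str, challenge: str, has_ack: bool = False) -> str:
    """Return the prototype name for one direction of authentication.

    auth_type: "IA", "OA" or "DA"                     (Step I)
    challenge: "F" (forced), "S" (self), "0" (none)   (Step II)
    has_ack:   does the prover return the challenge?  (Step III, DA with F only)
    """
    if auth_type not in ("IA", "OA", "DA"):
        raise ValueError(f"unknown authentication type: {auth_type}")
    if challenge not in ("F", "S", "0"):
        raise ValueError(f"unknown challenge type: {challenge}")
    if challenge == "S" and auth_type in ("IA", "DA"):
        # IA with a self challenge is excluded explicitly in the text;
        # DA with a self challenge is not among the seven listed types.
        raise ValueError(f"{auth_type} with self challenge is not a prototype")
    name = f"{auth_type}_{challenge}"
    if auth_type == "DA" and challenge == "F":
        name += ",Ack" if has_ack else ",NoAck"
    return name


def all_prototypes() -> list:
    """Enumerate every prototype the three steps can produce."""
    found = set()
    for auth_type in ("IA", "OA", "DA"):
        for challenge in ("F", "S", "0"):
            for has_ack in (False, True):
                try:
                    found.add(classify(auth_type, challenge, has_ack))
                except ValueError:
                    pass  # combination excluded by the classification
    return sorted(found)
```

Enumerating all inputs in this way yields exactly eight names, matching the eight prototypes of Table 2.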



4 Mutual Authentication Protocols 

In the previous section we derived all possible prototypes; eight different ones in
total. Now we consider how many different prototypes of mutual authentication
protocols there are. We can count them exhaustively as follows.
First, the answer will be no more than 8^2 = 64 because we consider eight different
types for both directions: authentication of B by A and vice versa. Any protocol has an
initiator entity, which we call A, and a responder entity, B. In order to rule out
protocols which are merely mirror images of each other, with initiator and responder
interchanged, we will regard prototypes where B acts as initiator as illegal. Consider
the following example protocol corresponding to the prototype DA_F,Ack-OA_S, which
means DA_F,Ack for authentication of B by A, and OA_S for authentication of A by B.

Protocol 1. An example of the mutual authentication prototype DA_F,Ack-OA_S

1. A → B: BPubKey{A, r_A, TS_A, APriKey{B, TS_A}}

2. A ← B: r_A

The pattern BPubKey{A, r_A, ...} from A to B and the subsequent response r_A from
B to A contribute to destination authentication of B by A, whereas APriKey{B, TS_A,
...} contributes to origin authentication of A by B. This example is not the only
possible one; for instance, APriKey{B, TS_A, ..., BPubKey{A, r_A, ...}} can be considered
an example of the prototype as well. What about its symmetric counterpart
OA_S-DA_F,Ack, that is, OA_S for authentication of B by A, and DA_F,Ack for
authentication of A by B? According to this combination of prototypes, we can construct
the following example.

Protocol 2. An example of the mutual authentication prototype OA_S-DA_F,Ack, which is an
illegal case

1. A ← B: APubKey{B, r_B, TS_B, BPriKey{A, TS_B}}

2. A → B: r_B



This protocol violates the condition that A is the initiator of the protocol. It is
tempting to assume that every possible protocol has a symmetric illegal protocol
unless both A and B use the same prototype. This would mean that the total number of
mutual authentication prototypes should be the eight symmetric cases plus half of the
remaining 56, or 36 in total. But in fact this is not the case. For example, OA_F-DA_F,Ack
and its counterpart DA_F,Ack-OA_F are both legal combinations as shown below.

Protocol 3. An example of the mutual authentication prototype OA_F-DA_F,Ack

1. A → B: r_A

2. A ← B: BPriKey{A, r_A, APubKey{B, r_B}}

3. A → B: r_B

Protocol 4. An example of the mutual authentication prototype DA_F,Ack-OA_F

1. A → B: BPubKey{A, r_A}

2. A ← B: r_A, r_B

3. A → B: APriKey{B, r_B}

Although Protocols 3 and 4 are symmetric in the sense of the use of prototypes, it
is clear that they cannot be interchanged just by renaming the entities. In fact, we
constructed all the examples corresponding to the 64 (= 8^2) mutual authentication
prototypes and then identified 17 prototypes among them to be illegal combinations,
leaving 47 (= 64 - 17) legal prototypes.
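
The counting argument can be checked mechanically. The sketch below is illustrative only: it enumerates the ordered pairs of prototypes and reproduces the naive estimate of 36, but it takes the number of illegal combinations (17) directly from the text, since deciding legality requires constructing each example protocol by hand. The label strings are assumptions of the example, with "0" standing for "no challenge".

```python
from itertools import product

# Labels for the eight unilateral prototypes of Table 2.
PROTOTYPES = ["IA_0", "IA_F", "OA_0", "OA_S", "OA_F",
              "DA_0", "DA_F,NoAck", "DA_F,Ack"]

ILLEGAL_COUNT = 17  # figure reported in the text, not derived here


def count_mutual_prototypes():
    """Return (all ordered pairs, naive symmetric estimate, legal count)."""
    pairs = list(product(PROTOTYPES, repeat=2))    # one prototype per direction
    total = len(pairs)                             # 8^2 = 64
    symmetric = sum(1 for x, y in pairs if x == y)  # the eight X-X cases
    naive = symmetric + (total - symmetric) // 2   # 8 + 56/2 = 36
    legal = total - ILLEGAL_COUNT                  # 64 - 17 = 47
    return total, naive, legal
```

The gap between the naive estimate and the reported count reflects pairs such as OA_F-DA_F,Ack and DA_F,Ack-OA_F, both of which are legal.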

In the Appendix, many published protocols are classified according to their
prototypes. This shows that only a small number of the 47 prototypes have been used
widely in authentication protocols published in the literature. Many of the prototypes
missing from the table in the Appendix are minor variations of existing ones. For
example, the prototype DA_F,Ack-IA_F is missing, yet can be obtained from the
prototype DA_F,Ack-OA_F by simply deleting its final message. However, there may
still be missing prototypes which are of interest to investigate. One example is the
prototype OA_S-OA_S as described in Protocol 5, which has the useful property that it
can be executed in one round of communication since the responder can send his
message without waiting to receive (or to process) the message from the initiator.

Protocol 5. Prototype OA_S-OA_S

1. A → B: TS_A, APriKey{B, TS_A}

2. A ← B: TS_B, BPriKey{A, TS_B}

Through deriving all the possible prototypes, we also found that no concrete instance
of the prototype DA_F,Ack-OA_F had been published, in spite of its
attractive properties for mobile communication security.

Protocol 6. Prototype DA_F,Ack-OA_F

1. A → B: BPubKey{A, r_A}

2. A ← B: r_A, r_B

3. A → B: APriKey{B, r_B}

In Protocol 6, the cryptographic form of the first message is particularly well suited
to the mobile communication environment, as mobile terminals (A in Protocol 6) can
access the public key of the network (B in Protocol 6) using the system broadcast
channel, which is in any case necessary for call setup apart from security purposes. The
third protocol message is compatible with a digital signature on relevant messages under
the user’s private key, APriKey. Hence, this prototype is particularly suitable for
future mobile communication security, where nonrepudiation of user messages is
important for electronic payment. We developed this prototype into a more specific
and secure protocol [3].

Of course we can apply more than one prototype to a unilateral authentication;
hence we can obtain a very complex prototype like OA_F DA_F,NoAck - OA_F DA_F,NoAck,
which means each entity authenticates the other using the two prototypes OA_F and
DA_F,NoAck. For our 47 basic prototypes, however, we did not count this kind of
multiple application of prototypes. This is because, first of all, such excessive
use of expensive public-key computation cannot be justified without a sufficiently
good reason, and secondly any attempt to include all the possible multiple applications
of prototypes will end up with infinitely many prototypes! However, it is of interest to
note that we can find this kind of multiple application of prototypes in the literature;
we present one example from the ISO/IEC key transport mechanisms [8].

Protocol 7. A simplified description of ISO/IEC key transport mechanism 5

1. A → B: r_A

2. A ← B: S_B(r_B, r_A, A, E_A(B, K_B))

3. A → B: S_A(r_A, r_B, E_B(A, K_A))

Session key K_AB = one_way_function(K_A, K_B)

Here S_A and E_A denote A’s private signature transformation and A’s public
encipherment transformation, respectively. After a simple check, we can identify the
protocol as of type OA_F DA_F,NoAck - OA_F DA_F,NoAck. Here we counted each of the
secret keys K_A and K_B as a random challenge or forced challenge because we can consider a
secret key as a random value. A possible view of this protocol is that each entity
authentication comes from a signature transformation and key confidentiality is
guaranteed by the adoption of public key based encryption. Therefore we may classify
the protocol as an instance of the prototype OA_F-OA_F. However, from another
viewpoint, we may consider that the encryption in the protocol contributes to entity
authentication as well and there is clearly multiple application of prototypes in either
direction. When the signature operation is not critical, this protocol will be overkill.



5 Learning Lessons from Failure 

Past experience has shown that many protocols turn out to be weak or faulty
in some way. We believe that these failures can be used to help us avoid repeating the
same mistakes. Unfortunately, however, current practice regarding protocol design
and analysis is not promising on this point. We believe that a classification-oriented
perspective is critically important because it enables us to approach a particular
authentication protocol with its corresponding prototype in mind. In this way, we
are able to reuse lessons learned from past failures in a concrete protocol belonging to
the same prototype. First let us take a look at the following example protocol
belonging to the prototype DA_F,Ack-DA_F,Ack.

Protocol 8. An example protocol of the prototype DA_F,Ack-DA_F,Ack

1. A → B: BPubKey{A, r_A}

2. A ← B: r_A, APubKey{B, r_B}

3. A → B: r_B

The cryptographic messages BPubKey{...} and APubKey{...} are meant to
authenticate their respective destination entities (so this protocol is of the form
DA-DA), and each of them contains a random nonce as the required (forced) challenge
(DA_F-DA_F), to be followed by a corresponding response, thus resulting in the
prototype DA_F,Ack-DA_F,Ack. This protocol is directly constructed from its
corresponding prototype, not the other way around. In the construction process, of
course, we applied a precious lesson learned from failures of existing protocols:
include the proper identity field in encryption or signature messages [1]. Hence A and
B are included in the BPubKey{...} and APubKey{...} in the protocol; in fact we
defined identities as one type of cryptographic particle earlier. This protocol may be
strengthened even further, thwarting any attack similar to the Canadian Attack [12,5].
The original context in which this attack was introduced was a signature based
authentication protocol corresponding to the prototype OA_F,Ack-OA_F,Ack, where
APriKey{•} and BPriKey{•} type messages are used instead of APubKey{•} and
BPubKey{•}. Despite this difference, Protocol 8 is vulnerable to essentially the same
attack as shown below.

Canadian attack applied to Protocol 8

1. E\A → B: BPubKey{A, r_E}

2. E\A ← B: r_E, APubKey{B, r_B}

1'. A ← B/E: APubKey{B, r_B}

2'. A → B/E: r_B, BPubKey{A, r_A}

3. E\A → B: r_B

Here the notation E\A → B means that E is masquerading as A to B. The result of the
attack is the same as when it is applied to protocols of the prototype OA_F,Ack-OA_F,Ack:
the attacker E is accepted by B under the false identity A. Here, the victim entity A is
used as an oracle by the attacker. This protocol can be fixed by including r_A in
APubKey{B, r_B} of the second message, resulting in the following protocol.

Protocol 9. Fixed protocol of the prototype DA_F,Ack-DA_F,Ack

1. A → B: BPubKey{A, r_A}

2. A ← B: r_A, APubKey{B, r_A, r_B}

3. A → B: r_B

Inclusion of r_A in the encrypted message and the subsequent response r_B
guarantees to B that A is aware of the r_A which he sent in the first message. Here we
can see that the protection against the Canadian attack is not message authentication
(of r_A here) but rather a sort of message awareness (of r_A by A).
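
The attack on Protocol 8 and the effect of the fix can be replayed in a purely symbolic model. The sketch below is an illustration under stated assumptions: encryption is modelled by a tagged record that only its owner may open (no real cryptography is involved), all names (`Enc`, `responder_reply`, `canadian_attack`) are hypothetical, and the fix is modelled only through the message format, in that the three-field ciphertext of Protocol 9 is no longer a well-formed first message that A would answer as a responder.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Enc:
    """Symbolic public-key ciphertext: only `owner` can read `fields`."""
    owner: str
    fields: tuple


def decrypt(who, ct):
    if ct.owner != who:
        raise PermissionError(f"{who} cannot open a ciphertext for {ct.owner}")
    return ct.fields


def responder_reply(me, msg1, fixed):
    """Responder's step 2: open msg1 = MyPubKey{peer, r_peer} and answer.

    Returns ((r_peer, PeerPubKey{...}), r_me), or None if msg1 is malformed.
    """
    fields = decrypt(me, msg1)
    if len(fields) != 2:
        return None  # not a well-formed first message, so refuse to answer
    peer, r_peer = fields
    r_me = secrets.token_hex(8)
    # Protocol 9 (fixed) also places the initiator's nonce inside the ciphertext.
    body = (me, r_peer, r_me) if fixed else (me, r_me)
    return (r_peer, Enc(peer, body)), r_me


def canadian_attack(fixed):
    """Return True if E, who holds no private key of A, is accepted by B as A."""
    # 1.  E\A -> B: BPubKey{A, r_E}
    r_e = secrets.token_hex(8)
    (echo, ct_for_a), r_b = responder_reply("B", Enc("B", ("A", r_e)), fixed)
    # 2.  E\A <- B: r_E, APubKey{...}; E cannot open the ciphertext itself.
    # 1'. A <- B/E: E replays the ciphertext to A as the start of a new run.
    oracle = responder_reply("A", ct_for_a, fixed)
    if oracle is None:
        return False  # A refuses to act as a decryption oracle
    (leaked, _ct), _r_a = oracle
    # 2'. A -> B/E: r_B, BPubKey{A, r_A}; B's nonce now travels in the clear.
    # 3.  E\A -> B: r_B; B accepts if the returned nonce matches its own.
    return leaked == r_b
```

In this model `canadian_attack(fixed=False)` succeeds because A, acting as responder, returns B's nonce in the clear, whereas `canadian_attack(fixed=True)` fails.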

Now we have developed a good generic protocol corresponding to the prototype
DA_F,Ack-DA_F,Ack. Using this prototype and its generic protocol, we can analyse other
published concrete protocols. The first example is the Needham-Schroeder public key
protocol [14], whose simplified description using our notation is as follows.

Protocol 10. Needham-Schroeder public-key protocol

1. A → B: BPubKey{r_A, A}

2. A ← B: APubKey{r_A, r_B}

3. A → B: BPubKey{r_B}



It can be easily checked that this protocol belongs to the same prototype as
Protocol 9. The third message is encrypted only for confidentiality of r_B for use in
secret session key establishment. We can achieve the same effect more economically in the
third message by replacing encryption of r_B with hashing of r_B. Now let us compare
this protocol with Protocol 9. The Needham-Schroeder protocol lacks the identity B in
APubKey{•} in the second message. This leads to the attack presented by Lowe [9].
Another instance of the same prototype is the Helsinki protocol [13], given as
Protocol 11.

Protocol 11. Helsinki protocol

1. A → B: BPubKey{A, K_A, r_A}

2. A ← B: APubKey{K_B, r_A, r_B}

3. A → B: r_B

Comparison with Protocol 9 shows that this protocol lacks the identity B in the
second message, which is exactly the same pattern of weakness as in the
Needham-Schroeder protocol. Lowe’s attack on the Needham-Schroeder protocol appeared
in 1995, and applies to the Helsinki protocol as well, which was proposed in the same year.
Nevertheless, this weakness was found again in 1998 by Horng and Hsu [6]. Then, in
the same year, Mitchell and Yeun amended it [13] by including the identity B in the
encrypted part of the second message of the protocol, which is exactly the same way
Lowe fixed the Needham-Schroeder protocol. Of course, Mitchell and Yeun were
well aware of the same pattern of weakness and the corresponding remedy in both
protocols.
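
The comparison against the generic protocol amounts to a field-by-field check of the encrypted part of the second message. A minimal sketch follows, with the field lists transcribed from Protocols 9, 10 and 11; the helper names and the dictionary layout are assumptions made for the illustration.

```python
# Encrypted fields of message 2 (sent by B under APubKey) in each protocol.
SECOND_MESSAGES = {
    "Protocol 9 (generic)": ("B", "r_A", "r_B"),
    "Needham-Schroeder": ("r_A", "r_B"),
    "Helsinki": ("K_B", "r_A", "r_B"),
}


def missing_identity(fields, sender):
    """True if the encrypted fields omit the identity of the sender."""
    return sender not in fields


def flag_protocols(sender="B"):
    """List the protocols whose second message lacks the sender's identity."""
    return [name for name, fields in SECOND_MESSAGES.items()
            if missing_identity(fields, sender)]
```

Under these assumptions `flag_protocols()` singles out the Needham-Schroeder and Helsinki protocols, the two instances that were later repaired by adding B to the ciphertext.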

In summary, we can see that every concrete protocol of the same prototype suffers 
from the same pattern of weakness, and can be fixed with the same pattern of remedy. 
This is why it is very important for us to identify the corresponding prototype when 
we are given a concrete protocol. We believe that our classification method is simple 
and very helpful for identification of the prototypes of authentication protocols.



6 Deriving a Prototype for a Given Protocol

In the previous sections we have identified the type of a number of particular 
protocols. Sometimes, however, it is not so obvious. The first example protocol is an 
authentication and key establishment protocol proposed for the Universal Mobile 
Telecommunications System by the European ASPeCT project [2]. 

Protocol 12. ASPeCT protocol for authentication and key establishment in UMTS

A: UMTS user, B: UMTS network

1. A → B: g^{r_A}

2. A ← B: r_B, h2(K_AB, r_B, B)

3. A → B: { {h3(g^{r_A}, g^b, r_B, B)}_{K_A^{-1}}, A }_{K_AB}

A, B: K_AB = h1(r_B, g^{b r_A})

Notation:

g : a generator of a finite group
r_A, r_B : random numbers chosen by A and B, respectively
b, g^b : a certified long-term private and public key of B for key agreement
K_AB : session key to be shared between A and B
{ • }_{K_A^{-1}} : a signature using A’s private signature key
{ • }_{K_AB} : an encrypted message using the session key K_AB
h1, h2, h3 : one-way hash functions specified for the ASPeCT protocol
g^• : discrete exponentiation modulo a prime

We first need to identify all the challenges exchanged. We can easily see that r_B
plays the role of forced challenge from B to A. As for the reverse direction, it is not so
clear, but a little thought shows that g^{r_A} is what we are looking for. The
network B receives g^{r_A} and then raises it to the power of b, which is its own private
key. This corresponds to the generic form BPriKey{•}.

Remember that here we wish to ignore specific mathematical forms like g^{r_A} and
h(•); instead, we only focus on the essential elements, i.e., cryptographic particles as
described in Definition 1 of Section 2. In this way we are inevitably led to overlook or
omit other important properties of given protocols, such as key establishment and
digital signature. However, the foremost goal of this classification work is to derive
all possible generic forms of entity authentication. That is why our oversimplification
may be justified. In this particular protocol, for instance, we can think of g^{r_A} as a
random challenge from A to B when entity authentication itself is the concern. In other
words, there is no change with regard to the goal of entity authentication even if we
replace g^{r_A} with simply r_A. The specific form g^{r_A} is for secure key
establishment, but it is not of any importance to our classification purpose.

The subsequent transformation of hashing h1(r_B, g^{b r_A}) does not affect this basic
cryptographic property with regard to challenge-response. Focusing only on the
cryptographic particles in this way, we can derive the corresponding authentication
prototype of the protocol: OA_F,Ack-OA_F,Ack. That is, in this protocol, the user A and the
network B authenticate each other with the basic prototype origin authentication
using forced challenge followed by response (see Definition 2 and Definition 3).

The interesting point to be noted in this protocol is that we do not have to rely on
signature transformations to obtain protocols of the prototype OA_F,Ack-OA_F,Ack. The
message which corresponds to BPriKey{•} does not provide any
signature property because it can be computed as (g^b)^{r_A} by the other party A as well.
After all, as far as authentication is concerned, we do not consider any compound
property like digital signature. We can reason that this protocol may have a similar
strength (and weakness if any) to its related protocols of the same prototype, of which
the well-known Station-to-Station (STS) protocol is an example.

Let us investigate the Yacobi-Shmuely protocol [16] as the second example.

Protocol 13. Yacobi-Shmuely protocol

1. A → B : a + r_A

2. A ← B : b + r_B

A: K = (g^{b + r_B} · g^{-b})^{r_A}

B: K = (g^{a + r_A} · g^{-a})^{r_B}

Notation:

a, g^{-a} : long-term private and public keys of A
b, g^{-b} : long-term private and public keys of B
r_A, r_B : short-term random nonces chosen by A and B, respectively
K : new session key to be shared between A and B

Here, a and b correspond to APriKey and BPriKey, respectively, and g^{-a} and
g^{-b} to APubKey and BPubKey. The analysis regarding challenge-response is rather
tricky in this protocol. Note that anyone can retrieve g^{r_A} and g^{r_B} from the protocol
messages. Hence, we reason that, for instance, a + r_A roughly corresponds to
APriKey{g^{r_A}} because anyone can retrieve g^{r_A} using APubKey, which is public
data. We can see that g^{r_A} here does not play the role of a forced challenge as we
defined it, because it is not combined with B’s private key but with A’s private key.
This seems to be rather an anomaly; we can say that this protocol does not have any
challenge values, either forced or self ones. Finally we can derive a generic protocol
or prototype corresponding to this protocol as follows.

Protocol 14. Generic form of the Yacobi-Shmuely protocol

1. A → B : APriKey{g^{r_A}}

2. A ← B : BPriKey{g^{r_B}}

We can see that the protocol corresponds exactly to OA_∅-OA_∅ (see Table 2). The
concrete protocol, now replaced by its generic prototype, clearly shows how strong
or weak it is with regard to entity authentication. There is no guarantee of freshness
in this protocol, which leads to its vulnerability to replay attacks. This weakness and
the attack were published several times, for example by Martin and Mitchell [10]. The
translation from the concrete version to the generic prototype described here is roughly
the same procedure as that of BAN logic [4]. The only difference is that our process is
more concerned with prototype derivation. Even though the translation may appear
trivial, its significance is not. We believe that if the Yacobi-Shmuely protocol had
been known as an instance protocol which belongs to the inherently weak prototype
OA_∅-OA_∅, then the following modification [15] to the protocol would not have been
tried at all.
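
Why a prototype without any challenge cannot resist replay, while the same prototype with a forced challenge can, is easy to see in a symbolic model. The sketch below is only an illustration: `Signed` stands for APriKey{...} and is assumed to be unforgeable by convention (nothing in the code enforces this), and the class and function names are hypothetical.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Signed:
    """Symbolic APriKey{...}: assumed creatable only by `signer`, copyable by anyone."""
    signer: str
    fields: tuple


class Verifier:
    def __init__(self, name, use_challenge):
        self.name = name
        self.use_challenge = use_challenge
        self.nonce = None

    def challenge(self):
        # OA_F: issue a fresh nonce for this run; OA_0: issue nothing.
        self.nonce = secrets.token_hex(8) if self.use_challenge else None
        return self.nonce

    def accept(self, claimed, msg):
        expected = (self.name,) if self.nonce is None else (self.name, self.nonce)
        return msg.signer == claimed and msg.fields == expected


def replay_succeeds(use_challenge):
    """Record A's message from one honest run and resend it in a later run."""
    b = Verifier("B", use_challenge)
    nonce = b.challenge()
    recorded = Signed("A", ("B",) if nonce is None else ("B", nonce))
    assert b.accept("A", recorded)   # the honest run is accepted
    b.challenge()                    # a new run starts
    return b.accept("A", recorded)   # the attacker resends the old message
```

With `use_challenge=False` the recorded message is accepted again, which is the situation of OA_∅-OA_∅; with `use_challenge=True` the new nonce makes the old message useless.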

Protocol 15. A modified version of the Yacobi-Shmuely protocol

1. A → B : g^{a + r_A}

2. A ← B : b + r_B

A: K = (g^{b + r_B} · g^{-b})^{r_A}

B: K = (g^{a + r_A} · g^{-a})^{r_B}

The motive for the modification is to alleviate the computational burden of discrete
exponentiation for B (which may be a mobile terminal). However, changing a + r_A to
g^{a + r_A} in the first message corresponds to changing APriKey{g^{r_A}} to a
non-cryptographic message. Nobody except A can generate a message of the
form APriKey{•}, even though anyone can replay the message. However, the modified
form allows anyone to generate a new value of g^{a + r_A} which is an entirely legal one.
This fatal weakness and the relevant attack appeared in previous work [3].
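
The weakness can be made concrete with a small numeric sketch. It rests on assumptions that should be read as such: public keys have the form g^{-a} and g^{-b} as in the notation of Protocol 13, the session key is derived as K = g^{r_A r_B}, and the modulus, generator and key values are toy parameters chosen for the illustration. The three-argument `pow` with a negative exponent requires Python 3.8 or later.

```python
P = 2**127 - 1   # toy prime modulus (a Mersenne prime), an assumption of this sketch
G = 3            # toy base element, also an assumption


def forge_first_message(pub_a, x):
    """Compute g^(a+x) from A's public key g^(-a) alone, as g^x * (g^(-a))^(-1)."""
    return (pow(G, x, P) * pow(pub_a, -1, P)) % P


def demo():
    a, b = 123456789, 987654321                   # long-term private keys (toy values)
    pub_a, pub_b = pow(G, -a, P), pow(G, -b, P)   # public keys g^(-a), g^(-b)

    x = 424242                                    # exponent chosen by the attacker
    msg1 = forge_first_message(pub_a, x)          # sent to B in A's name

    r_b = 1357911                                 # B's nonce
    msg2 = b + r_b                                # B's reply, as in Protocol 15

    key_b = pow(msg1 * pub_a % P, r_b, P)                 # B: (g^(a+x) * g^(-a))^(r_B)
    key_e = pow(pow(G, msg2, P) * pub_b % P, x, P)        # E: (g^(b+r_B) * g^(-b))^x
    return msg1 == pow(G, a + x, P), key_b == key_e
```

Both components returned by `demo()` are true under these assumptions: the forged value is a legal first message, and the attacker ends up sharing the session key with B without ever using A's private key.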



7 Conclusion 

We have developed a classification scheme for authentication protocols using abstract 
fundamental protocol elements. We have demonstrated the practical use of the 
scheme with a number of examples. The classification can be useful in both design 
and analysis of protocols. 

• Protocol analysis can be facilitated by identifying the class in which a particular 
protocol lies and then comparing its security with other protocols in the same 
class whose security is well understood. 

• Protocol design can proceed by identifying the protocol requirements and using 
these to identify the desirable class of protocol to be used in the application. 

Both these uses of the classification technique allow for systematic re-use of 
previous analysis and design experience, which aids in the move towards an 
engineering approach to protocol design and analysis.



Appendix: General Prototypes of Authentication and Their
Corresponding Protocols



Prototype                Example (generic protocols)                        Concrete protocols

IA_F-∅                   1. A → B: r_A                                      • ElGamal key agreement
                         B: BPriKey{A, r_A}                                 • ISO/IEC key agreement mechanism 2

DA_F,NoAck-∅             1. A → B: BPubKey{A, r_A}                          • ISO/IEC key transport mechanism 1

IA_∅-IA_∅                A: APriKey{B}                                      • ISO/IEC key agreement mechanism 1
                         B: BPriKey{A}

IA_F-IA_F                1. A → B: r_A                                      • ISO/IEC key agreement mechanism 4
                         2. A ← B: r_B                                      • ISO/IEC key agreement mechanism 5
                         A: APriKey{B, r_B}                                 • Goss protocol
                         B: BPriKey{A, r_A}                                 • Matsumoto-Takashima-Imai (MTI) A0
                                                                              key agreement protocol

IA_F-OA_S                1. A → B: r_A, TS_A, APriKey{B, TS_A}              • ISO/IEC key agreement mechanism 3
                         B: BPriKey{A, r_A}                                 • Nyberg-Rueppel key agreement
                                                                              protocol

OA_F-OA_F                1. A → B: r_A                                      • ISO/IEC key agreement mechanism 7
                         2. A ← B: BPriKey{A, r_A}, r_B                     • Station-to-Station protocol
                         3. A → B: APriKey{B, r_B}                          • ISO/IEC key transport mechanism 5
                                                                            • ASPeCT protocol for UMTS

OA_F-DA_F,NoAck          1. A → B: r_A                                      • ISO/IEC key transport mechanism 4
                         2. A ← B: APubKey{B, r_B, BPriKey{A, r_A}}         • ISO/IEC key agreement mechanism 6
                         or,                                                • Beller-Yacobi’s two-pass protocol
                         1. A → B: r_A
                         2. A ← B: BPriKey{A, r_A, APubKey{B, r_B}}

DA_F,NoAck-OA_S          1. A → B: BPubKey{A, r_A, TS_A, APriKey{B, TS_A}}  • ISO/IEC key transport mechanism 2
                         or,                                                • ISO/IEC key transport mechanism 3
                         1. A → B: TS_A, APriKey{B, TS_A, BPubKey{A, r_A}}  • North American PACS public key AKA
                                                                              protocol

DA_F,Ack-OA_F            1. A → B: BPubKey{A, r_A}                          • Boyd-Park protocol [3]
                         2. A ← B: r_A, r_B
                         3. A → B: APriKey{B, r_B}

DA_F,NoAck-DA_F,NoAck    1. A → B: BPubKey{A, r_A}                          • SKEME protocol
                         2. A ← B: APubKey{B, r_B}

DA_F,Ack-DA_F,Ack        1. A → B: BPubKey{A, r_A}                          • ISO/IEC key transport mechanism 6
                         2. A ← B: APubKey{B, r_B}, r_A                     • COMSET protocol
                         3. A → B: r_B                                      • Needham-Schroeder public key protocol



Abstract. Fair exchange protocols are a mechanism to ensure that
items held by two parties are exchanged without one party gaining an 
advantage. Several such protocols have been proposed in recent years. 
We used the Possum animation tool to explore these protocols to
examine whether they achieve their security goals. Our experiments revealed
some new attacks and helped to gain other useful insights into various
fair exchange protocols.



1 Introduction 

Protocols designed to provide various security services have been the subject 
of intense research in recent years. It is widely recognised that such protocols 
are difficult to get right; this seems to be due mainly to the problem of how
to adequately account for the possible actions of an attacker. In contrast to a 
communications protocol designed to provide reliable data transfer, a security 
protocol is really a set of protocols parametrised by the attacker’s actions and 
each of these (potentially infinite) variations must be correct. 

Use of formal methods to analyse security protocols has been quite
successful [5]. However, there are still some limitations which apply variously to the
different techniques.

— It may not be possible to step through the protocol and examine the effect of
various actions.

— It may not be possible to use a widely established formal language to specify 
the protocol. 

— Considerable training may be required to use the software tools. 

— Search spaces become too big when complex protocols such as electronic 
commerce protocols are examined. 

The purpose of the research described in this paper is to explore an alternative
use of formal methods for security protocol examination: protocol animation.
We have chosen a type of protocol that has not been widely examined formally
and use many simplifying assumptions that allow a small specification to be
quickly derived. This allows us to obtain useful results quickly even for complex
protocols which would be too large to address comfortably with other protocol
analysis tools. Some of the features of this approach are the following:

— its use of the Z standard specification technique, which is approaching inter- 
national standardisation. 

— its use of a friendly general tool which is freely available for academic research 
purposes. 

— its suitability for quick and easy animation of protocols to aid understanding 
for designers and users. 

— its demonstrated potential for revealing new protocol attacks and likely im- 
plementation errors. 

In the rest of this section we will explain the class of fair exchange protocols 
which we have chosen for our study and then introduce the Possum animation 
tool which we used. The next section gives an extended description of how we 
specified and animated a specific protocol. This leads to a more generic protocol 
examined in the following section. 



1.1 Fair Exchange Protocols 

Fair exchange protocols apply to two principals: the originator and the recipient, 
which we denote O and R respectively. The general idea is that O and R wish
to exchange items but that each wants to ensure that if it releases its item then 
it is guaranteed to receive the item of the other principal. There are a number 
of applications which require such a protocol. 

Contract signing. O and R have agreed on a contract to be signed. Each
wants to ensure that its signature becomes available to the other only when
the other’s signature is available too.

Certified mail. O has an electronic mail message that it wishes to send to R
but wants to ensure that it gets a receipt for the message. R does not want
to allow O to get a receipt if the message has not been sent.

Fair payment. O wishes to buy some (electronic) goods from R. O will send
payment in exchange for the goods, but neither party wants to send first.

Early solutions to the fair exchange problem involved simultaneous exchange 
of secrets. These protocols are not efficient enough for widespread practical use.
More recently solutions involving a third party as a notary have been proposed [9, 
10]. In order to minimise the involvement of the third party, recent solutions have 
used the so-called optimistic approach in which the third party only becomes 
involved in the event of a dispute between the principals involved in the exchange. 
First proposed by Bürk and Pfitzmann [2], the optimistic approach was used by
Asokan, Shoup and Waidner [1] to design generic fair exchange protocols. In this 
paper we will only examine protocols using the optimistic approach. 
One of the fundamental differences between fair exchange protocols and most
protocols which have been formally analysed (such as key establishment protocols)
is that the protocol principals O and R are mutually distrustful. In a key
establishment protocol each holder of a session key must trust the other hol- 
der(s) not to give that key away and the adversary is a third party who controls 
the communications medium. In fair exchange protocols the adversary of each 
principal is the other principal. Therefore in our specification we shall ignore the 
possibility of an external attacker. This greatly simplifies the specification, but 
is an assumption also made by designers of fair exchange protocols. Although in 
practice confidentiality and integrity of data will be important, these must be 
provided by other mechanisms. 

To our knowledge there have been two previous attempts to use formal tech- 
niques in the examination of fair exchange protocols. Zhou and Gollmann [11] 
briefly considered a logic based approach. An extended analysis by Schneider [7] 
used CSP to examine one specific protocol of Zhou and Gollmann and provided 
(human generated) proofs of security within the model used. Although Schneider 
makes some of the same simplifications as we do (for example considering only 
internal attackers) his analysis is at a much more detailed level. It may be ap- 
propriate therefore to use our technique initially to explore different protocols at 
a high level of abstraction and then use Schneider’s approach to obtain detailed 
proofs once a suitable candidate has been chosen. 

The messages in most fair exchange protocols are protected by digital sig- 
natures to guarantee their origin and non-repudiability. We do not model the 
signatures directly, but use them to make another major simplification to the 
model. Because the recipient of any protocol message can use a digital signature 
to ensure that the message is correctly formed, we will assume that a malicious 
user cannot alter messages or send incorrectly formed messages. Therefore, the 
only attacks available to a malicious principal are to send messages out of the 
correct order. 

Later we will see that scrutinising this assumption allows us to find problems
in some protocols. In general our approach is based on exploration of a very 
simplified version of the protocol together with an informal examination of the 
simplifying assumptions made. Undoubtedly a more detailed model could cap- 
ture more possible attacks than our model. However, we would claim, firstly, that 
the level of detail is the same as that used in the informal analysis in the publis- 
hed papers describing the protocols and, secondly, that a systematic checking of 
the assumptions used in the model can lead to finding attacks at a finer level of 
detail than the model itself. 

1.2 The Possum Animation Tool 

Possum is a software tool developed at the Software Verification Research Cen- 
tre at the University of Queensland [4]. It provides animation of specifications 
written in Sum, which is essentially a markup of the formal language Z [6]. Indeed,
Possum accepts specifications input in Z using styles zed.sty and oz.sty and will
translate such inputs into Sum notation. Z is a widely used
formal language which is approaching international standardisation. It is based
on first order predicate logic and set theory together with structural components 
known as schemas. 

Possum supports Z conventions for modelling state based systems and al- 
lows for manual as well as script based animations. It also supports a graphical 
front end to enable visual animation of specifications if required. The specifi- 
cations presented in this paper are written in Z although we freely adopt the 
enhancements provided by Sum to simplify state based description. We do not 
believe this will lead to any confusion but the concerned reader should consult 
the appropriate literature [8]. 

2 The Protocol of Gomila and Rotger 

In this section we look in detail at a recently published protocol for certified 
electronic mail by Gomila and Rotger [3]. This is the simplest protocol that we 
examined, having only three messages in the normal exchange protocol. However, 
the overall structure of the protocol is very similar to other protocols and it 
therefore serves to illustrate the assumptions made and the level of abstraction 
used. The purpose of this protocol is for O to fairly exchange an electronic mail
message in return for a receipt from R. If R receives a message from O then R
should obtain, or be able to obtain, a receipt for that message. O should not be 
able to obtain a receipt for any message which was not received by R. 

The protocol is modelled with a global viewpoint which includes the current 
state of the protocol principals. There are three protocol entities: the originator 
O, the recipient R and the trusted third party T. Of these only T is assumed always
to act correctly. Messages in fair exchange protocols are typically protected
with digital signatures. We do not model these directly, but implicitly include 
the service they provide by ensuring that only the signing party can generate 
messages in which they are included. This enforces an ordering on certain mes- 
sage combinations. Malicious behaviour is therefore only modelled by allowing 
the principals to engage in the subprotocols in a different order from what they 
should according to the specification. 



2.1 System State 

Z specifications are well suited to state based modelling which we use here. We 
first need to define the items of interest to record in the system state. Later we 
define the operations which update the system state. 

MessageNumber == 0..3

A message number is used to identify how many messages in the regular 
exchange have been sent. 

STATUS ::= active | cancelled | finished

Each of the originator (O), recipient (R) and third party (T) has a certain
state which can be one of the elements of STATUS. The outcome of interaction
with T depends on its state. The states of O and R are used to decide if the
protocol is ‘secure’. Players have status finished once they have obtained what 
they want from the protocol; this means either they have all required items from 
the other party or the protocol is cancelled. Honest parties will not engage in 
any exceptional sub-protocols when their state is finished, but malicious O or R 
may continue to do so at any time. 

RECEIPTS ::= message | cancel_token | receipt

The type RECEIPTS defines things that can be received by O or R. The
cancel token is different for O and R but that difference is not modelled. 

state
  MessagesSent : MessageNumber
  Tstate : STATUS
  Ostate : STATUS
  Rstate : STATUS
  Oreceipts, Rreceipts : ℙ RECEIPTS



The system state records the status of the principals, what they have re- 
ceived and also how many messages have been sent. This last is required since 
some protocol messages can only be sent after others have been sent. (This or- 
dering includes some implicit assumptions about the cryptographic properties 
of signatures.) 
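As a rough illustration outside the Z notation, the system state can be mirrored in Python. This is a minimal sketch and not part of the original specification: the names Status, Item and State and the snake_case field names are assumptions chosen to resemble the schema, and the default values assume the starting point in which all principals are active and nothing has been sent or received.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    """Counterpart of the free type STATUS."""
    ACTIVE = "active"
    CANCELLED = "cancelled"
    FINISHED = "finished"


class Item(Enum):
    """Counterpart of the free type RECEIPTS (hypothetical name)."""
    MESSAGE = "message"
    CANCEL_TOKEN = "cancel_token"
    RECEIPT = "receipt"


@dataclass(frozen=True)
class State:
    """Global view of the protocol, one field per declaration in the schema."""
    messages_sent: int = 0                      # MessageNumber == 0..3
    t_state: Status = Status.ACTIVE
    o_state: Status = Status.ACTIVE
    r_state: Status = Status.ACTIVE
    o_receipts: frozenset = frozenset()         # subset of Item
    r_receipts: frozenset = frozenset()         # subset of Item

    def __post_init__(self) -> None:
        # Enforce the range restriction carried by the type MessageNumber.
        if not 0 <= self.messages_sent <= 3:
            raise ValueError("MessagesSent must lie in 0..3")


initial = State()
```

Making the record immutable is a design choice of the sketch: each operation then has to build a new state, which matches the before and after view of a Z operation.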



2.2 Message Operations 

The messages sent in a normal operation of the protocol when both O and R
act honestly (and assuming no communications errors and delays) are shown in 
Table 1. In the first message O sends the mail message to R encrypted with a 
randomly chosen key. In addition O sends the key encrypted with T’s public 
key. The signature of R in the second message constitutes the receipt. Notice 
that although R does not know the contents of M, R is only acknowledging 
receipt of whatever message will be decrypted when using the key which would be 
decrypted by T. In the third message O sends the key to decrypt the message. In
order to verify that the message obtained in message 3 is the same as the message
acknowledged in message 2, R needs to check that the ciphertext E_T(K) includes
the same key as obtained in message 3. This means that R will also need to
receive any randomisation information used by O to encrypt K.

Operations in a state based Z specification replace state elements with dashed 
elements of the same name. The dashed elements are regarded as the next state. 
(In addition operation inputs and outputs may be defined but are not used in 
this specification.) 
Table 1. Exchange Protocol of Gomila and Rotger

1. O → R: Enc(K, M), E_T(K), Sig_O(Enc(K, M), E_T(K))
2. R → O: Sig_R(Enc(K, M), E_T(K))
3. O → R: E_R(K)

M          Message to be sent from O to R.
K          Random key chosen by O to encrypt M.
Enc(K, .)  Encryption with shared key K.
E_P(.)     Encryption with public key of principal P.
Sig_P(.)   Signature by principal P. In this paper all signatures are assumed to be
           signatures with appendix. This means that the message must be available
           in order to verify the signature and the message itself cannot be derived
           from the signature.



init
  Tstate' = active
  Ostate' = active
  Rstate' = active
  MessagesSent' = 0
  Rreceipts' = ∅
  Oreceipts' = ∅



The first operation simply initialises all the elements of the system’s state to 
their starting values. Initially all principals are active, no messages have been 
sent and neither O nor R has received anything. We now specify the messages
sent in a ‘normal’ run of the protocol when both principals act correctly. 



Message1
  Δstate
  MessagesSent = 0
  MessagesSent' = 1
  changes_only{MessagesSent}



Nothing changes except the message count. It may seem odd that we include 
no details of the message at all in the formal model. The reason is that we 
are interested only in the state changes among the principals and these do not 
change when the first message is sent or received. When we come to animate the 
protocol we implicitly acknowledge the property of the signature used in this 
message by assuming that only O can send this message. 
Message2
  Δstate
  MessagesSent = 1
  MessagesSent' = 2
  Oreceipts' = Oreceipts ∪ {receipt}
  Ostate' = finished
  Rstate' = Rstate
  Rreceipts' = Rreceipts
  Tstate' = Tstate



An important precondition for success of the operation Message2 is that 
message 1 has been correctly received by R. This implicitly includes the cryp- 
tographic assumption that the signature in message 1 cannot be forged. In the 
animation we will assume that only R can send message 2 because only R can 
generate its signature. Another simplification of our model is that we do not 
differentiate between sending and receiving of messages, but treat the whole 
transfer as an atomic operation. This is reasonable in the absence of an external 
attacker and it is straightforward to check that communications errors will not 
result in an insecure outcome. The originator receives the receipt. Since this is 
all the originator needs from the protocol its state becomes finished. 

Message3
  Δstate
  MessagesSent = 2
  MessagesSent' = 3
  Rstate' = finished
  Rreceipts' = Rreceipts ∪ {message}
  Ostate' = Ostate
  Oreceipts' = Oreceipts
  Tstate' = Tstate



The recipient receives the message (through receipt of the key required to 
decrypt it). Since this is all the recipient needs from the protocol its state
becomes finished.
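The three message operations can be paraphrased as functions from a before-state to an after-state. The following Python is only a sketch of that reading: the State record, its field names and the helper _require are hypothetical, T's state is omitted because these operations leave it unchanged, and a failed precondition is reported by raising an exception, which is a choice of the sketch rather than Z semantics.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class State:
    """Hypothetical stand-in for the state elements touched by the exchange."""
    messages_sent: int = 0
    o_state: str = "active"
    r_state: str = "active"
    o_receipts: frozenset = frozenset()
    r_receipts: frozenset = frozenset()


def _require(condition: bool, text: str) -> None:
    if not condition:
        raise ValueError("precondition failed: " + text)


def message1(s: State) -> State:
    _require(s.messages_sent == 0, "MessagesSent = 0")
    return replace(s, messages_sent=1)          # only the count changes


def message2(s: State) -> State:
    _require(s.messages_sent == 1, "MessagesSent = 1")
    # O obtains the receipt and has everything it needs.
    return replace(s, messages_sent=2, o_state="finished",
                   o_receipts=s.o_receipts | {"receipt"})


def message3(s: State) -> State:
    _require(s.messages_sent == 2, "MessagesSent = 2")
    # R obtains the message (via the key) and has everything it needs.
    return replace(s, messages_sent=3, r_state="finished",
                   r_receipts=s.r_receipts | {"message"})


honest_run = message3(message2(message1(State())))
```

In this reading the message counter is what stops a principal from sending message 2 or 3 before its predecessor, which is how the model stands in for the unforgeability of the signatures.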

2.3 Exceptional Operations 

We now examine the operations that occur when the protocol is not completed as 
expected. There are two sub-protocols that can be invoked: the Cancel operation 
is intended to allow O to cancel the transaction if it does not receive message 2
in the exchange protocol; the Finish protocol is intended to allow R to force the 
protocol to completion if it does not receive message 3 in the exchange protocol. 

Table 2 shows the message exchange in the Cancel sub-protocol. In the model
this is contained in a single atomic operation; this appears to be a reasonable
simplification since it is an assumption that communications with T are reliable
and that T will always act honestly.

216 C. Boyd and P. Kearney



Table 2. Cancel operation in Gomila and Rotger Protocol

1. O → T: Enc(K, M), E_T(K), Sig_O(Enc(K, M), E_T(K))
2. if Tstate = finished then
     T → O: Sig_R(Enc(K, M), E_T(K))
   else
     T → O: (cancelled, Sig_O(Enc(K, M), E_T(K)))
     Tstate' = cancelled



It is an assumption in this sub-protocol that only O is able to invoke the
Cancel operation. This assumption is reflected in the animation when considering
the actions available to a malicious principal. For most of the protocols that we
examined it is difficult to decide whether this assumption is reasonable, mainly
because it is not specified how T will decide when two protocol runs are different.
Consider the Cancel protocol in Table 2. It is tempting to argue that the signature
of O prevents any other party from sending a valid first message. But if T is
not aware of the identity of O, then another party (in particular R) can send
the Cancel request message. If T simply stores, say, the encrypted message,
Enc(K, M), as its index to cancelled protocols then it will not know later who
initiated the Cancel protocol. In the Gomila and Rotger protocol there seems
to be no benefit for R to initiate the Cancel protocol but we will see below that
this is not the case in other protocols.

Cancel
  Δstate
  Oreceipts' =
    if Tstate = finished then
      Oreceipts ∪ {receipt}
    else
      Oreceipts ∪ {cancel_token}
  Tstate' = if Tstate = finished then Tstate else cancelled
  Ostate' = finished
  Rstate' = Rstate
  Rreceipts' = Rreceipts
  MessagesSent' = MessagesSent



If the state of T is finished then the request to cancel is overridden and O
gets a receipt from T instead. Otherwise O gets a cancel-token and T moves
into state cancelled. Note that O can invoke the Cancel protocol even though
message 1 has not been sent.
Table 3. Finish operation in Gomila and Rotger Protocol

1. R → T: Enc(K, M), E_T(K), Sig_O(Enc(K, M), E_T(K)), Sig_R(Enc(K, M), E_T(K))
2. if Tstate = cancelled then
     T → R: Sig_T(cancelled, Sig_R(Enc(K, M), E_T(K)))
   else
     T → R: Sig_T(K)
     Tstate' = finished



Table 3 shows the Finish sub-protocol which can be viewed as a dual of 
the Cancel protocol. It is assumed that only R can initiate the Finish protocol 
although, once again, it is not clear how this is enforced. If the state of T is 
cancelled then R gets a cancel-token from T. Otherwise R gets the message and 
T moves into state finished. The precondition MessagesSent > 1 is an implicit 
assumption that the signature in message 1 cannot be forged by R. 

Finish
  Δstate
  MessagesSent > 1
  Rreceipts' =
    if Tstate = cancelled then
      Rreceipts ∪ {cancel_token}
    else
      Rreceipts ∪ {message}
  Tstate' = if Tstate = cancelled then Tstate else finished
  Rstate' = finished
  Ostate' = Ostate
  Oreceipts' = Oreceipts
  MessagesSent' = MessagesSent
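The Cancel and Finish operations are duals in the sense that whichever request reaches T first fixes the outcome for both principals. A minimal Python sketch of T's decision follows; the function names, the use of plain strings for states and items, and the returned pairs are assumptions of the sketch, and the Finish precondition is taken as printed in the schema (MessagesSent > 1).

```python
def cancel(t_state: str) -> tuple:
    """Return (item handed to O, next state of T) for a Cancel request."""
    if t_state == "finished":
        # R has already forced completion, so O receives the receipt instead.
        return "receipt", t_state
    return "cancel_token", "cancelled"


def finish(t_state: str, messages_sent: int) -> tuple:
    """Return (item handed to R, next state of T) for a Finish request."""
    if messages_sent <= 1:
        raise ValueError("precondition failed: MessagesSent > 1")
    if t_state == "cancelled":
        # O has already cancelled, so R receives only a cancel token.
        return "cancel_token", t_state
    return "message", "finished"


# Cancel first, then Finish: both parties end up with cancel tokens.
o_item, t = cancel("active")
r_item, t = finish(t, messages_sent=2)
```

Note that neither function looks at what O or R already hold, so T's answer depends only on the order in which the two requests arrive.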



The final schema in the specification is an expression of what it means for 
the protocol to be secure. Although this is presented as an operation there are 
no changes to the state and we are interested only if the schema is true or false. 
This can be checked at any stage during the protocol animation. The operation 
has value true if, when one party has what it wants and the other has finished, 
then the other party also has what it wants. 

Secure
  state
  state'
  (Ostate = finished ∧ message ∈ Rreceipts) ⇒ receipt ∈ Oreceipts
  (Rstate = finished ∧ receipt ∈ Oreceipts) ⇒ message ∈ Rreceipts
  changes_only{}
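Read as a predicate on a single state, the Secure schema is a conjunction of two implications. The Python below is an illustrative sketch only; the function name, argument names and string encoding of states and items are assumptions, and each implication A ⇒ B is written as (not A) or B.

```python
def secure(o_state: str, r_state: str, o_receipts: set, r_receipts: set) -> bool:
    """True when neither principal has been left at a disadvantage."""
    # If O has finished and R holds the message, O must hold the receipt.
    o_ok = not (o_state == "finished" and "message" in r_receipts) \
        or "receipt" in o_receipts
    # If R has finished and O holds the receipt, R must hold the message.
    r_ok = not (r_state == "finished" and "receipt" in o_receipts) \
        or "message" in r_receipts
    return o_ok and r_ok


# Outcome of an honest run: both hold what they wanted.
fair = secure("finished", "finished", {"receipt"}, {"message"})

# O holds a receipt but R was only given a cancel token.
unfair = secure("finished", "finished",
                {"cancel_token", "receipt"}, {"cancel_token"})
```

The predicate is vacuously true while a principal is still active, which is why it can be checked after every step of an animation rather than only at the end.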
2.4 Results of Animation

The protocol of Gomila and Rotger was animated using the Possum tool to 
investigate the operation and security of the protocol. The general procedure 
used was as follows. 

1. Assume that party O is malicious and party R is honest.

2. Initialise the protocol.

3. Methodically search the tree of possible operations by O and animate those
actions followed by the response of the honest R.

4. Check that the Secure operation remains true at each step. 

This procedure can then be repeated with the roles of O and R reversed.

The animation search quickly found an attack on the protocol which allows
O to obtain a receipt for a mail message which was never received by R. The
malicious O runs the Cancel operation at any time (for example right at the
start of the protocol run) and then proceeds with the exchange protocol but
refuses to send message 3. When R attempts to run the Finish operation only
a cancel-token is obtained since T believes the protocol to have been cancelled.
The transcript of a Possum session showing the attack is given in Appendix A.
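The search procedure described above can be imitated by a small breadth-first exploration. The Python below is a stand-in written for illustration, not Possum or its scripting interface: the tuple encoding of the state, the function names and the depth bound are assumptions, the honest R is modelled by letting it act only while its own state is active, and the Finish precondition follows the schema as printed (MessagesSent > 1).

```python
from collections import deque

# State tuple: (MessagesSent, Tstate, Ostate, Rstate, Oreceipts, Rreceipts)
INIT = (0, "active", "active", "active", frozenset(), frozenset())


def message1(s):
    sent, t, o, r, orec, rrec = s
    return (1, t, o, r, orec, rrec) if sent == 0 else None


def message2(s):
    sent, t, o, r, orec, rrec = s
    if sent != 1 or r != "active":          # honest R replies only while active
        return None
    return (2, t, "finished", r, orec | {"receipt"}, rrec)


def message3(s):
    sent, t, o, r, orec, rrec = s
    return (3, t, o, "finished", orec, rrec | {"message"}) if sent == 2 else None


def cancel(s):                              # malicious O may call this at any time
    sent, t, o, r, orec, rrec = s
    if t == "finished":
        return (sent, t, "finished", r, orec | {"receipt"}, rrec)
    return (sent, "cancelled", "finished", r, orec | {"cancel_token"}, rrec)


def finish(s):
    sent, t, o, r, orec, rrec = s
    if sent <= 1 or r != "active":
        return None
    if t == "cancelled":
        return (sent, t, o, "finished", orec, rrec | {"cancel_token"})
    return (sent, "finished", o, "finished", orec, rrec | {"message"})


def secure(s):
    sent, t, o, r, orec, rrec = s
    o_ok = not (o == "finished" and "message" in rrec) or "receipt" in orec
    r_ok = not (r == "finished" and "receipt" in orec) or "message" in rrec
    return o_ok and r_ok


OPS = {"Message1": message1, "Message2": message2, "Message3": message3,
       "Cancel": cancel, "Finish": finish}


def find_attacks(max_depth=4):
    """Return every operation sequence up to max_depth that ends insecure."""
    attacks, queue = [], deque([(INIT, ())])
    while queue:
        state, trace = queue.popleft()
        if not secure(state):
            attacks.append(trace)           # stop extending an insecure trace
            continue
        if len(trace) == max_depth:
            continue
        for name, op in OPS.items():
            nxt = op(state)
            if nxt is not None:             # operation enabled in this state
                queue.append((nxt, trace + (name,)))
    return attacks


for trace in find_attacks():
    print(" -> ".join(trace))
```

Under these assumptions the sequence Cancel, Message1, Message2, Finish is among the traces reported, and every reported trace places Cancel before Finish.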



3 The Protocols of Asokan, Shoup, and Waidner 

Asokan, Shoup and Waidner [1] published a generic protocol for fair exchange 
together with specific variants for applications such as contract signing, certified 
mail, and payment in exchange for receipt. The general format of the protocols 
is the same as that described above for the protocol of Gomila and Rotger: there 
is an exchange protocol which runs in the ‘normal’ case, and two sub-protocols 
designed for use in exceptional situations. The main structural difference is that 
their exchange protocols include a commitment by both participants before the 
exchange takes place. This means that their exchange protocol contains at least 
four messages, and even five when further assurances of non-repudiation of re- 
ceipt and delivery are required. (It was on the grounds of greater efficiency in 
the number of exchanged messages that Gomila and Rotger claimed that their 
protocol was an improvement.) 

Because of its greater complexity the general protocol is more interesting 
to analyse through animation. We explored both the generic protocol and the 
simplified variants; here we will discuss only the generic protocol since it contains 
all the features relevant to the analysis of the variants. 

The generic protocol is designed for exchange of general forwardable items, 
which are any items that T can send on to O or R. (This may exclude certain
items like electronic coins which must be spent with an interactive protocol, 
but includes contracts, messages and signed commitments to payment.) O and 
R have items, itemO and itemR respectively, which they wish to exchange fairly.
The protocol also includes exchange of signed non-repudiation tokens: a
non-repudiation of origin token indicates the origin of each item and a
non-repudiation of receipt acknowledges receipt of each item. Thus at the end of a
successful protocol run O should obtain the following three things: 

— the item itemR; 

— the token nroR indicating that R was the originator of itemR; and 

— the token nrrR indicating that R has received itemO. 

R should also have the corresponding things with roles reversed. The general 
structure of the protocol is that in messages 1 and 2 O and R make respective 
commitments to their items to be exchanged. In message 3, O sends itemO and
nroO to R. In message 4, R sends all three of itemR, nroR and nrrR. Finally in
message 5, O completes the protocol by sending nrrO. The Abort sub-protocol is
invoked by O after message 1 if message 2 is not received within some reasonable
time. The other sub-protocol is the Resolve protocol which may be invoked by
either O or R if a subsequent expected message is not received.



3.1 Attack of Zhou, Deng, and Bao 

Recently Zhou, Deng and Bao [12] found attacks on the certified mail protocol 
of Asokan et al. which also extend to the generic protocol. They identified that 
the problem was with the Abort sub-protocol which performs the same role as 
the Cancel sub-protocol in Gomila and Rotger’s protocol. It is possible for either 
O or R to cheat.

O can cheat by first performing the Resolve operation after receiving message
2 and then invoking Abort. Once the Abort protocol has completed, when R
invokes the Resolve protocol he is unable to obtain anything from T except
the abort-token. R may also cheat by invoking the Resolve protocol after receiving
message 1, which results in him obtaining all his required items. However, since
message 2 was never delivered, O is unable even to invoke the Resolve protocol.
Evidently the Abort protocol is also useless at this stage and so the outcome is
unfair to O.

We used the animation of our protocol specification to observe these protocol 
problems, illustrating that the method could have been used to reveal these flaws. 
Zhou et al. proposed a modified Abort protocol which avoids the attacks they 
discovered. We confirmed this in the animation of the certified mail variant and 
also applied the same Abort protocol to the generic protocol. 



3.2 New Problems 

Two further issues were discovered through use of the animation. 



Who Can Run the Abort operation? When exploring the generic ASW 
protocol we noticed that it is possible to get to an insecure state if O
runs the Abort protocol early (say before the exchange protocol starts) and then
continues with the exchange protocol. By refusing to continue after receiving
message 3, R will obtain itemO and nroO but O will then be unable to run
the Resolve protocol. Of course O should never continue with the protocol after 
running the Abort operation so it would be wrong to claim that this is an attack. 
But there are two possible situations where this could become an attack. 

1. An implementation must decide on a timeout when waiting for message 2 to 
be returned before aborting the protocol. A careless implementation could 
allow the protocol to proceed if message 2 was received before the Abort 
protocol was completed, rather than before it was started. This would allow 
the above to become a real attack. 

2. An implementation must decide how T records which transactions are abor- 
ted. One natural index of a transaction would be the descriptions of the items 
to be exchanged. But this would allow the recipient to invoke the Abort ope- 
ration since R knows all the required parameters and can sign the Abort 
request with its own private key. Then the above feature would again be a 
real attack. Instead T must record the claimed originator and recipient for 
each transaction and ensure that protocol instances with different originator 
or recipient are treated distinctly even if the items exchanged are identical. 
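The second pitfall can be made concrete with a toy model of how T might record aborted transactions. Everything in the Python below is hypothetical: the class ThirdParty, its methods and the item descriptions are inventions for illustration, and signature checking is reduced to treating the signer of an Abort request as the claimed originator.

```python
class ThirdParty:
    """Toy record of aborted transactions kept by T (illustrative only)."""

    def __init__(self, index_by_parties: bool) -> None:
        self.index_by_parties = index_by_parties
        self.aborted = set()

    def _key(self, originator: str, recipient: str, items: tuple):
        if self.index_by_parties:
            return (originator, recipient, items)
        return items            # careless: item descriptions alone name the run

    def abort(self, signer: str, recipient: str, items: tuple) -> None:
        # The signer of the request is recorded as the claimed originator.
        self.aborted.add(self._key(signer, recipient, items))

    def is_aborted(self, originator: str, recipient: str, items: tuple) -> bool:
        return self._key(originator, recipient, items) in self.aborted


ITEMS = ("description of itemO", "description of itemR")

# R signs an Abort request for the same items, posing as originator.
careless = ThirdParty(index_by_parties=False)
careless.abort(signer="R", recipient="O", items=ITEMS)
hit_careless = careless.is_aborted("O", "R", ITEMS)

careful = ThirdParty(index_by_parties=True)
careful.abort(signer="R", recipient="O", items=ITEMS)
hit_careful = careful.is_aborted("O", "R", ITEMS)
```

With the item-only index, R's request marks the run between O and R as aborted; with originator and recipient in the key, the same request is filed as a separate transaction and the run between O and R is unaffected.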

We believe that the above possibilities illustrate the usefulness of the protocol 
animation in identifying potential pitfalls for implementors. 

Unfair Exchange of Non-Repudiation Tokens. The animation showed that 
there is an attack which allows O to receive all its items, while R is missing the
nrrO to show that O received itemR. This attack applies whether or not the
fixed Abort protocol of Zhou et al. is used. To make the attack work O runs the
Abort protocol at any time (for example before the exchange begins) and then 
completes the exchange protocol up to the receipt of message 4, but refusing 
to send message 5. When R runs the Resolve operation it will obtain only an 
abort-token and not an affidavit which would allow it to go to a judge outside 
the protocol. 

It is interesting to compare this attack with the above observation concerning 
who may run the Abort operation since it still applies here. The attack succeeds 
for O if R acts honestly (and there is no breakdown in communications). But
if R is malicious, or suspects that O is cheating, then it can run Resolve after
receiving Message3, and if it receives an abort-token then it can quit, thereby
disadvantaging O.

In order to overcome this attack it seems necessary to have a different Resolve 
protocol to run if Message5 was not received by R. If R shows that it already
received itemO but T knows that O had previously invoked the Abort operation, 
then T should issue an affidavit that R completed the exchange protocol. 

3.3 Zhou, Deng, and Bao Protocol 

Zhou et al. [12] also proposed a new fair exchange protocol for certified electronic 
mail. It is designed to have several improvements over that of Asokan et al., for 
example with regard to efficiency. At the level of abstraction of our specification
there is essentially no difference between their new protocol and the protocol of 
Asokan et al. with the new Abort protocol of Zhou et al. 

One of the simplifying assumptions we have been making is that malicious 
principals can only send correctly formed messages. In general this is guaranteed 
by the digital signatures used. However, there is a signature field in Message1 of
the new protocol that cannot be checked by R. The message has the following 
format. 

R, H(M, K), Enc_K(M), E_T(K), Sig_O(R, H(M, K), Enc_K(M)), sub_K

Here sub_K = Sig_O(R, H(M, K), K, T, Sig_O(R, H(M, K), Enc_K(M))) and since
R does not know K at this stage it cannot check this signature. This leads to
the question of what might happen if this field was incorrect (for example O
could just send a random string instead of the correct sub_K). The only use of
sub_K for R is in the Resolve operation. If R sends the wrong sub_K field then the
Resolve operation will fail unless O has already resolved. This means that the
new protocol does not allow R to complete the exchange unilaterally; instead R must
wait to see whether O will resolve. This is a disadvantage which the protocol of
Asokan et al. was designed to overcome.

A way to avoid this problem is to have sub_K included in the first signature
of message 1, so that message 1 becomes:

R, H(M, K), Enc_K(M), E_T(K), Sig_O(R, H(M, K), Enc_K(M), sub_K), sub_K

Then when T checks message 1 in the Resolve protocol it can decide whether O
was cheating or whether R has tried to use a false sub_K. In the former case the
protocol should be aborted.
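The decision T can make with the amended message 1 may be sketched as follows. This Python is an illustration under strong assumptions and not the authors' construction: an HMAC under a key held by the checker stands in for O's signature (a real signature would be verified with O's public key), T is assumed to have recovered K already by decrypting E_T(K), the inner signature inside sub_K is assumed to remain Sig_O(R, H(M, K), Enc_K(M)), and all function names and result strings are hypothetical.

```python
import hashlib
import hmac


def sign(key: bytes, *fields: bytes) -> bytes:
    """Stand-in for Sig_O: HMAC over length-prefixed fields (not a real signature)."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for f in fields:
        mac.update(len(f).to_bytes(4, "big") + f)
    return mac.digest()


def verify(key: bytes, tag: bytes, *fields: bytes) -> bool:
    return hmac.compare_digest(sign(key, *fields), tag)


def resolve_check(o_key, r_id, h, enc, k, t_id, sig1, sub_k) -> str:
    """Classify a Resolve request that presents the amended message 1."""
    # The first signature now covers sub_K, so it pins down what O really sent.
    if not verify(o_key, sig1, r_id, h, enc, sub_k):
        return "R used a false sub_K"
    inner = sign(o_key, r_id, h, enc)
    if not verify(o_key, sub_k, r_id, h, k, t_id, inner):
        return "O cheated: abort"
    return "ok"


# Placeholder values for the demonstration.
o_key, r_id, t_id = b"O signing key", b"R", b"T"
k, m, enc = b"session key K", b"mail body", b"ciphertext placeholder"
h = hashlib.sha256(m + k).digest()

good_sub = sign(o_key, r_id, h, k, t_id, sign(o_key, r_id, h, enc))
good_sig1 = sign(o_key, r_id, h, enc, good_sub)
junk_sub = bytes(32)                       # O sends a meaningless string
junk_sig1 = sign(o_key, r_id, h, enc, junk_sub)

honest = resolve_check(o_key, r_id, h, enc, k, t_id, good_sig1, good_sub)
o_cheats = resolve_check(o_key, r_id, h, enc, k, t_id, junk_sig1, junk_sub)
r_cheats = resolve_check(o_key, r_id, h, enc, k, t_id, good_sig1, junk_sub)
```

The point of the sketch is the order of the two checks: because the first signature covers sub_K, a mismatch there can only come from the party presenting the message, while a sub_K that is covered by the first signature but fails its own check must have been produced by O.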

Of course the above problem cannot be found with our animation. But it 
is an example of a potential problem that can be identified through careful 
examination of the assumptions made in the protocol specification. 



4 Conclusion 

We have shown that animation is a useful process for designers and users of 
security protocols, particularly where complex protocols are used with a number 
of optional steps. Even a very simplified specification has been shown to be
useful in finding attacks, and critical examination of the assumptions made in
the specification has been shown to give further insights into other problems
even when these problems are not evident in the animation itself.

Possum supports techniques to automate the search for insecure states. Re- 
peated application of our technique to a variety of protocols would certainly 
benefit from such automation. Another obvious extension is to include proofs 
of security of the simplified protocols in the case of honest users. To achieve 
this the actions of an honest principal must be specified, which is easily done 
by ensuring that only the prescribed sequence of operations is completed by
the principal concerned. However, security proofs are not really the object of
our approach and it may be questioned what is the value of a proof of such a 
simplified protocol version. 

The situation in which protocol participants have different goals and mistrust 
each other is typical of electronic commerce protocols. It would be interesting 
to apply the approach described in this paper to electronic payment protocols 
which use a trusted third party to resolve dispute situations. 
A Possum Session for Protocol of Gomila and Rotger

1 sum: file "gr.sum"
Yes

2 sum: param currentmodule GRprotocol
Yes

3 GRprotocol: init
[MessagesSent' := 0, Oreceipts' := {}, Ostate' := active,
Rreceipts' := {}, Rstate' := active, Tstate' := active]

4 GRprotocol: Cancel
[MessagesSent := 0, Oreceipts := {}, Ostate := active, Rreceipts := {},
Rstate := active, Tstate := active, MessagesSent' := 0,
Oreceipts' := {cancel_token}, Ostate' := finished, Rreceipts' := {},
Rstate' := active, Tstate' := cancelled]

5 GRprotocol: Message1
[MessagesSent := 0, Oreceipts := {cancel_token}, Ostate := finished,
Rreceipts := {}, Rstate := active, Tstate := cancelled,
MessagesSent' := 1, Oreceipts' := {cancel_token}, Ostate' := finished,
Rreceipts' := {}, Rstate' := active, Tstate' := cancelled]

6 GRprotocol: Message2
[MessagesSent := 1, Oreceipts := {cancel_token}, Ostate := finished,
Rreceipts := {}, Rstate := active, Tstate := cancelled,
MessagesSent' := 2, Oreceipts' := {cancel_token, receipt},
Ostate' := finished, Rreceipts' := {}, Rstate' := active,
Tstate' := cancelled]

7 GRprotocol: Finish
[MessagesSent := 2, Oreceipts := {cancel_token, receipt},
Ostate := finished, Rreceipts := {}, Rstate := active,
Tstate := cancelled, MessagesSent' := 2,
Oreceipts' := {cancel_token, receipt}, Ostate' := finished,
Rreceipts' := {cancel_token}, Rstate' := finished, Tstate' := cancelled]

8 GRprotocol: Secure
no solution

8 GRprotocol: state
[MessagesSent := 2, Oreceipts := {cancel_token, receipt},
Ostate := finished, Rreceipts := {cancel_token}, Rstate := finished,
Tstate := cancelled]
Abstract. In Key Predistribution Schemes (KPS) and One-Time use 
Broadcast Encryption Schemes (OTBES), a Trusted Authority (TA) generates 
secret information and distributes part of it to users. It also has to 
watch for users’ dishonest acts in order to prevent collusion attacks. 
However, these tasks can be heavy for a TA if the system has a large number 
of users. In this paper, we propose Hierarchically Structured KPS (HS-KPS) 
as an effective solution to this problem, and we evaluate its performance 
in terms of efficiency, security and memory size. 



1 Introduction 

Non-interactive key distribution schemes, called Key Predistribution Systems 
(KPS) [1], are an important and interesting topic in cryptography because of 
their high security and wide range of uses. With these techniques we can 
easily set up conference key distribution systems and broadcast encryption 
systems, as well as key sharing between two entities. In particular, KPS is 
well known to be an effective primitive for broadcast encryption, and 
broadcast encryption systems built on KPS are called One-Time use Broadcast 
Encryption Schemes (OTBES) [8]. In KPS and OTBES, once the trusted authority 
(TA) has delivered the users’ secrets to their owners over secure channels, 
no interaction among users is required to share communication keys. However, 
although KPS has many desirable properties, delivering secrets to all users 
in a secure way can be a heavy task for a TA. Furthermore, in order to deal 
with collusion attacks, the TA must constantly watch for suspicious behaviour 
by any user. Public-key infrastructures (PKI) face almost the same problem: 
in a PKI with certificate authorities (CA), the root CA carries a 
considerable load if it has to produce certificates for all users’ public 
keys and manage the revocation of keys. For this reason, established PKIs 
set up multiple CAs hierarchically. Although constructing hierarchical TAs 
can also be a practical solution to the problem in KPS and OTBES, it is 
difficult to realize efficiently: since the properties of PKI and KPS differ 
significantly, the method applied to CAs cannot be carried over to TAs 
straightforwardly. In this paper, we show a practical method for 
constructing hierarchical TAs and evaluate its performance. With our scheme, 
each TA only has to pass secrets to its lower entities and watch for their 
illegal behaviour. Moreover, the memory size required for users in our 
scheme is almost the same as in conventional KPS and OTBES. 

In the following subsections, we give a brief review of KPS and OTBES and 
point out the problem with them in more detail. Following them, we also briefly 
mention the result of our research. 

1.1 Related Work 

The first concrete non-interactive key distribution scheme was presented by 
Blom [2]. Matsumoto and Imai [1] generalized such schemes and named them Key 
Predistribution Systems (KPS). Based on this generalization, various 
realizations of KPS have been proposed [5,6,7,3,4,8,9,10]. In KPS, no 
interaction is required in advance to share communication keys: to share a 
key, a participant only has to input its communication partner’s identifier 
into its secret function. Lower bounds on the memory size of users’ secret 
algorithms were shown in [3,4,11,12]. Moreover, [8,12] applied KPS to 
broadcast encryption. In these broadcast encryption systems, only privileged 
users can compute the decryption key for the encrypted data, and no other 
users can obtain any information about the key. Such a key distribution 
system is easily realized with KPS: the privileged users share a key among 
themselves by means of KPS, and a content distributor encrypts the broadcast 
data with that key. Therefore, only privileged users can decrypt the 
broadcast data. This broadcast encryption scheme is called a One-Time use 
Broadcast Encryption Scheme (OTBES) [8,4,13] and is regarded as one of the 
most practical solutions for secure content distribution. KPS and OTBES are 
well surveyed in [14]. In KPS and OTBES, since a trusted authority (TA) has 
to give all users’ secrets to their owners, this predistribution procedure 
is a significantly heavy task for the TA. In the following subsection, we 
describe this problem in more detail. 

1.2 Problems and Requirements on Implementing KPS and OTBES 

As described in the previous subsection, KPS and OTBES place heavy tasks on 
the Trusted Authority (TA): managing and authorizing users, distributing each 
user’s secret information in a secure manner, managing the revocation of 
keys, and watching for users’ dishonest behaviour, e.g. collusion attacks. 
The TA’s tasks can therefore be considerably heavy if the number of users is 
large. As already mentioned, CAs in PKI face almost the same problem: if the 
root CA has to produce the certificates for all users, its task can be 
considerably heavy. Suppose that a single CA, located in a certain country, 
has to manage a billion users around the world. In this case, the CA needs 
to somehow authorize an extremely large number of users. Furthermore, since 
most of the users are in other countries, each authorization requires very 
complicated procedures. In order to avoid this problem, CAs are often 
constructed hierarchically. In such hierarchical CAs, each CA authorizes its 
lower entities (users or more local CAs). With this technique, although only 
the root CA is trusted, the number of entities that need to be authorized by 
the root CA can be very small. This hierarchical construction is considered 
to be effective for the TA in KPS or OTBES as well. However, it is difficult 
to implement for KPS and OTBES, primarily for the following reasons. In 
conventional KPS and OTBES only the root TA can produce users’ secrets. A 
straightforward modification is to give the root TA’s secret to all other 
TAs. However, in this scheme, if any TA’s secret is exposed, the whole system 
is broken immediately. Furthermore, the required memory size for each entity 
can be significantly large. Moreover, since entities can collude with any 
other users, each TA has to watch not only its own lower entities’ dishonest 
behaviour but also that of other entities. Therefore, in this paper we show 
how to implement hierarchical constructions in KPS and OTBES efficiently. 
Regarding hierarchical constructions of TAs, Blundo et al. [3,15] gave a 
slight hint of hierarchical TAs. However, they did not show concrete 
implementations or evaluate the performance in detail. 

Here, we summarize the desirable properties for constructing hierarchical 
TAs. 

Property (1) Each TA distributes secrets to its one-level lower entities. 
Property (2) Each TA watches dishonest acts among its one-level lower entities. 
Property (3) If a TA’s secret is somehow leaked, only its lower entities’ 
             secrets are exposed to danger. Other entities’ secrets are 
             still kept secure. 
Property (4) The required memory size for a TA is not unnecessarily large. 
Property (5) The required memory size for a user is not significantly larger 
             than that in conventional KPS or OTBES. 

We illustrate the hierarchical construction of TAs in Fig. 1. 

1.3 Our Results 

In this paper, a hierarchical construction of TAs for KPS and OTBES is shown. 
This construction fulfills all of the requirements mentioned in the previous 
subsection. Namely, in our proposed scheme each TA distributes secrets only 
to its one-level lower entities and watches for dishonest behaviour only 
among its one-level lower entities. Furthermore, if a TA’s secret is somehow 
leaked, only its lower entities’ secrets are exposed to danger; other 
entities’ secrets are still kept secure. Moreover, in our scheme the required 
memory size for a TA is significantly less than that in the straightforward 
modification mentioned in the previous subsection. The amount of memory for 
users is almost the same as that in the optimal KPS and OTBES. 

The organization of this paper is as follows. In Section 2, we give an 
overview of KPS and OTBES. Section 3 shows an efficient construction of 
hierarchical TAs in KPS and OTBES. Section 4 evaluates the performance of our 
scheme. In Section 5, an example of our scheme is shown. Finally, Section 6 
gives concluding remarks. 
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2 KPS and Broadcast Encryption 

2.1 A Brief Review of KPS and OTBES 

This subsection gives a brief overview of KPS and OTBES. In KPS and OTBES, 
we set up a Trusted Authority (TA), which provides secret information for 
each user. By using the secret information, users can non-interactively 
obtain common keys (in KPS) or a session key for decrypting the broadcast 
contents (in OTBES). 

In KPS, when a user $U_i$ communicates with $U_j$, $U_i$ can obtain the 
common key between $U_i$ and $U_j$ by using his given secret information and 
$U_j$’s identifier. No information about this key can be obtained by other 
users. KPS can also be utilized for conference-key distribution. In KPS for 
conference-key distribution, $U_i$ obtains a conference key by using his 
secret information and his partners’ identifiers. 

In OTBES the broadcast contents can be decrypted only by privileged users. 
OTBES can be realized by employing KPS (for conference-key distribution): 
by using the conference key among the privileged users in KPS as the 
decryption key, only these users can decrypt the broadcast contents. 

In the following subsections, we show properties of KPS and OTBES in more 
detail. 

2.2 Key Predistribution Scheme 

Let $2^{\mathcal{U}}$ denote the set of all subsets of the set of users 
$\mathcal{U}$. $\mathcal{P} \subseteq 2^{\mathcal{U}}$ will denote the 
collection of all privileged subsets to which the TA distributes keys. 
$\mathcal{F} \subseteq 2^{\mathcal{U}}$ will denote the collection of all 
possible coalitions (forbidden subsets) against which each key is to remain 
secure. 

Once the secret information is distributed, each user in a privileged set $P$ 
should be able to compute the key $k_P$ associated with $P$, while no 
forbidden set $F \in \mathcal{F}$ ($|F| \le \omega$) disjoint from 
$P \in \mathcal{P}$ should be able to compute any information about $k_P$. 
Let $K_P$ be the set of possible keys associated with $P$. We assume that 
$K_P = K$ for each $P \in \mathcal{P}$. 

For $|P| = 2$, Blom [2] showed the first concrete realization of KPS. Later, 
this scheme was generalized by Matsumoto and Imai [1]. Following their 
concept, [3] showed a concrete realization of KPS for $|P| = t$ as follows. 
The TA chooses a random symmetric polynomial in $t$ variables over $GF(q)$ in 
which the degree of any variable is at most $\omega$, that is, a polynomial 

$$F(x_1, \ldots, x_t) = \sum_{i_1=0}^{\omega} \cdots \sum_{i_t=0}^{\omega} a_{i_1 \cdots i_t} \, x_1^{i_1} \cdots x_t^{i_t}, \qquad (1)$$

where $a_{i_1 \cdots i_t} = a_{\pi(i_1 \cdots i_t)}$ for any permutation 
$\pi$. The TA computes $U_i = F(u_i, x_2, \ldots, x_t)$ and gives it to user 
$u_i$ in a secure manner. In this case, the key $k_P \in K_P$ associated with 
the $t$-subset $P = \{u_1, \ldots, u_t\}$ is $k_P = F(u_1, \ldots, u_t)$. In 
this scheme, $|K_P| = q = |K|$ and 

$$\log |U_i| = \binom{\omega + t - 1}{t - 1} \log |K|.$$
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This scheme is optimal in terms of memory size for users due to the following 
proposition. 



Proposition 1 ([3]) For $|P| = t$ and $|F| \le \omega$, $|U_i|$ satisfies 

$$\log |U_i| \ge \binom{\omega + t - 1}{t - 1} \log |K|. \qquad (2)$$

In particular, in the case of $t = 2$, Eq. (2) becomes 

$$\log |U_i| = (\omega + 1) \log |K|. \qquad (3)$$



2.3 Broadcast Encryption 

Here, $\mathcal{P} \subseteq 2^{\mathcal{U}}$ will denote the collection of 
all privileged subsets to which the TA might want to broadcast a message. 
$\mathcal{F} \subseteq 2^{\mathcal{U}}$ will denote the collection of all 
possible coalitions (forbidden subsets) against which the message is to 
remain secure. 

In OTBES, suppose that the TA wants to broadcast a message to a given 
privileged set $P \in \mathcal{P}$ at a later time. Note that the TA 
determines $P$ after distributing each user’s secret information. Let $M_P$ 
denote the set of possible messages that the TA might broadcast to $P$. 

As already mentioned, it is a known fact that OTBES is easily realized by 
using KPS [14,12]. Namely, by broadcasting $b_P := k_P + m_P$, only users who 
belong to $P$ can decrypt $m_P$, where $k_P$ is the common key among $P$ in 
KPS. 
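The masking step $b_P := k_P + m_P$ can be illustrated as follows. This is a minimal sketch that assumes addition is taken modulo a prime `Q` standing in for the key space; the names `broadcast` and `recover` and the sample values are hypothetical. The last lines show why the scheme is one-time use: reusing $k_P$ reveals the difference of two messages.

```python
Q = 2_147_483_647  # prime modulus standing in for the key space K (assumption)

def broadcast(k_p, m_p):
    """TA side: mask the message with the conference key of the privileged set."""
    return (k_p + m_p) % Q

def recover(k_p, b_p):
    """Privileged user side: remove the mask with the same key."""
    return (b_p - k_p) % Q

k_p = 123_456_789          # conference key of P, e.g. obtained from KPS
m1, m2 = 42, 1000          # two hypothetical messages
b1 = broadcast(k_p, m1)
assert recover(k_p, b1) == m1

# Reusing k_p for a second message leaks m1 - m2 to any eavesdropper.
b2 = broadcast(k_p, m2)
assert (b1 - b2) % Q == (m1 - m2) % Q
```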



3 Hierarchically Structured KPS 

In this section, we propose Hierarchically Structured KPS (HS-KPS), which has 
all of the properties listed in Section 1.2. In HS-KPS, TAs are set up 
hierarchically (see Fig. 1). Each TA receives secret information from its 
upper TA and distributes part of its secret information to its lower TAs. 
Users are located at the bottom of the hierarchy. Once a user has received 
its secret information, it can compute a conference key for any group of $t$ 
users without interacting with the members of the group. 



3.1 Notations 

In this section we prepare several terms and variables used in the following 
sections. 
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Fig. 1. Hierarchical construction of TAs. 



level            Layer level of HS-KPS. 
TA               A trusted authority. 
user             An end user in the system. 
entity           Either a TA or a user. 
parent / child   If two entities are directly connected at different levels, 
                 the upper is called the parent and the lower the child. 
domain           A group consisting of one parent entity and all of its 
                 child entities. 
bottom TA        A TA that is the parent entity of users. 
root TA          The TA located at the top of the hierarchy. 

$l$ denotes the height of the hierarchy. The root TA is located at level $l$ 
and users are located at level 0; $l$ is fixed in this paper. A TA at level 
$m$ is written $\mathrm{TA}_m$. $\mathrm{E}_m$ denotes $\mathrm{TA}_m$ or 
user $i$. Because users are located at level 0, $\mathrm{E}_0$ denotes a 
user. Note that in the following arguments $\mathrm{E}_m$ can always be 
replaced with $\mathrm{TA}_m$ or user $i$. If $\mathrm{E}_m$ is located under 
$\mathrm{E}_{m+1}$, we write 

$$\mathrm{E}_{m+1} \rightarrow \mathrm{E}_m.$$

$TA_m$ (italic) and $U_i$ (italic) are the ID numbers assigned to 
$\mathrm{TA}_m$ and user $i$, respectively. $E_m$ (italic) denotes the ID 
number assigned to $\mathrm{E}_m$. $\mathbf{U}_i$ is the vector of ID numbers 
assigned to user $i$: 

$$\mathbf{U}_i = (TA_{l-1}, \ldots, TA_1, U_i).$$

In this case, the following relation must be satisfied: 

$$\mathrm{TA}_{l-1} \rightarrow \mathrm{TA}_{l-2} \rightarrow \cdots \rightarrow \mathrm{TA}_1 \rightarrow i.$$

$F_{TA_m}$ and $F_i$ are the functions owned by $\mathrm{TA}_m$ and user $i$, 
respectively. These functions are secret information. $t$ is the number of 
users that share a conference key. $k_{i_1, \ldots, i_t}$ is the key shared 
among users $i_1, \ldots, i_t$. $\omega_m$ is the threshold for collusion 
attacks at level $m$: the secret of $\mathrm{E}_{m+1}$ is leaked if 
$\omega_m + 1$ of its child entities collaborate. 

Remark 1 In HS-KPS, only a bottom TA has more than $\omega_0 + 1$ child 
entities (= users). The other TAs have exactly as many child entities as the 
threshold $\omega_m$, so that a collusion attack by TAs is impossible. 



3.2 Implementation 



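Key distribution phase. First, the root TA generates a function that is 
symmetric in $(x_{1,j}, x_{2,j}, \ldots, x_{t,j})$ for all $j$, 
$0 \le j \le l-1$: 

$$F(x_{1,l-1}, \ldots, x_{1,0}, \ldots, x_{t,l-1}, x_{t,l-2}, \ldots, x_{t,0}) = \sum_{i_{1,l-1}=0}^{\omega_{l-1}} \cdots \sum_{i_{t,0}=0}^{\omega_0} a_{i_{1,l-1} \cdots i_{t,0}} \, x_{1,l-1}^{i_{1,l-1}} \cdots x_{t,0}^{i_{t,0}}, \qquad (4)$$

where $a_{i_{1,l-1} \cdots i_{t,0}}$ has the same value if indices of the 
same level are exchanged. For example, 

$$a_{i_{1,l-1} \cdots i_{1,0} \cdots i_{t,l-1} \cdots i_{t,0}} = a_{i_{1,l-1} \cdots i_{t,0} \cdots i_{t,l-1} \cdots i_{1,0}}. \qquad (5)$$

Next, the root TA inputs $TA_{l-1}$ into $x_{1,l-1}$ and distributes the 
resulting function to $\mathrm{TA}_{l-1}$. Thus, each $\mathrm{TA}_{l-1}$ 
receives 

$$F_{TA_{l-1}} = F(TA_{l-1}, x_{1,l-2}, \ldots, x_{t,0}).$$

Similarly, each $\mathrm{TA}_{l-1}$ inputs $TA_{l-2}$ into $x_{1,l-2}$ and 
distributes the resulting function to $\mathrm{TA}_{l-2}$: 

$$F_{TA_{l-2}} = F(TA_{l-1}, TA_{l-2}, x_{1,l-3}, \ldots, x_{t,0}).$$

Finally, user $i$ has 

$$F_i = F(TA_{l-1}, \ldots, TA_1, U_i, x_{2,l-1}, \ldots, x_{t,0}) = F(\mathbf{U}_i, x_{2,l-1}, \ldots, x_{t,0}). \qquad (6)$$

Key generation phase. To exchange a key among users $i_1, \ldots, i_t$, user 
$i_1$ computes 

$$k_{i_1, \ldots, i_t} = F(\mathbf{U}_{i_1}, \mathbf{U}_{i_2}, \ldots, \mathbf{U}_{i_t}). \qquad (7)$$

Note that every user knows $\mathbf{U}_i$ for any user $i$. 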
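The two phases above can be sketched for the smallest interesting case, $l = 2$ and $t = 2$. The code below is an illustration under stated assumptions, not the authors' implementation: the prime `Q` stands in for $GF(q)$, the helper names (`root_polynomial`, `substitute_first`, `evaluate`) and all ID values are hypothetical, and the polynomial is stored as a dictionary from exponent tuples to coefficients in the variable order $(x_{1,1}, x_{1,0}, x_{2,1}, x_{2,0})$.

```python
import random
from itertools import product

Q = 2_147_483_647  # prime modulus standing in for GF(q) (illustrative choice)

def root_polynomial(w1, w0, rng):
    """Coefficients a[(i11, i10, i21, i20)] of F(x11, x10, x21, x20).

    A coefficient is unchanged when the two level-1 exponents are exchanged
    or when the two level-0 exponents are exchanged (the symmetry of Eq. (4)).
    """
    a = {}
    for i11, i10, i21, i20 in product(range(w1 + 1), range(w0 + 1),
                                      range(w1 + 1), range(w0 + 1)):
        canon = (min(i11, i21), min(i10, i20), max(i11, i21), max(i10, i20))
        if canon not in a:
            a[canon] = rng.randrange(Q)
        a[(i11, i10, i21, i20)] = a[canon]
    return a

def substitute_first(poly, value):
    """Fix the first remaining variable to `value` and drop its exponent."""
    out = {}
    for exps, coeff in poly.items():
        rest = exps[1:]
        out[rest] = (out.get(rest, 0) + coeff * pow(value, exps[0], Q)) % Q
    return out

def evaluate(poly, values):
    """Evaluate a polynomial at concrete values for all remaining variables."""
    total = 0
    for exps, coeff in poly.items():
        term = coeff
        for x, e in zip(values, exps):
            term = term * pow(x, e, Q) % Q
        total = (total + term) % Q
    return total

rng = random.Random(1)                 # fixed seed for a repeatable demo only
F = root_polynomial(w1=2, w0=3, rng=rng)
ta_a, ta_b = 5, 9                      # IDs of two level-1 TAs
u_i, u_j = 101, 202                    # IDs of one user under each TA
F_i = substitute_first(substitute_first(F, ta_a), u_i)  # root -> TA_a -> user i
F_j = substitute_first(substitute_first(F, ta_b), u_j)  # root -> TA_b -> user j
k_ij = evaluate(F_i, (ta_b, u_j))      # user i plugs in user j's ID vector
k_ji = evaluate(F_j, (ta_a, u_i))      # user j plugs in user i's ID vector
assert k_ij == k_ji
```

In this sketch each user stores $(\omega_1 + 1)(\omega_0 + 1)$ coefficients, and the two users obtain the same key even though they belong to different domains.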
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4 Evaluation 



In this section we evaluate HS-KPS in terms of key sharing, security and 
memory size, and show that it has all of the properties listed in Section 1.2. 



4.1 Key Sharing 

$k_{i_1, \ldots, i_t} = k_{\pi(i_1, \ldots, i_t)}$ for any permutation $\pi$, 
because the function $F$ is symmetric (see Eq. (4) and Eq. (5)). So, by 
following the procedures shown in Section 3.2, users can share one unique key 
among the members of the conference, and each TA distributes its secret only 
to its child entities (Property (1)). 



4.2 Security 

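In this section, we discuss the security of HS-KPS and prove that HS-KPS 
meets Property (2) and Property (3). For every entity, the secret information 
consists of the coefficients of its polynomial function. 

$\mathrm{E}_{m+1}$ holds the function 

$$F_{E_{m+1}} = \sum_{i_{1,m}=0}^{\omega_m} \cdots \sum_{i_{t,0}=0}^{\omega_0} a_{i_{1,m} \cdots i_{t,0}} \, x_{1,m}^{i_{1,m}} \cdots x_{t,0}^{i_{t,0}}. \qquad (8)$$

When $\mathrm{E}_{m+1}$ generates a function to give to a child entity 
$\mathrm{E}_m$, it inputs the child entity’s ID $E_m$ into $x_{1,m}$: 

$$F_{E_m} = \sum_{i_{1,m}=0}^{\omega_m} \cdots \sum_{i_{t,0}=0}^{\omega_0} a_{i_{1,m} \cdots i_{t,0}} \, E_m^{i_{1,m}} x_{1,m-1}^{i_{1,m-1}} \cdots x_{t,0}^{i_{t,0}} = \sum_{i_{1,m-1}=0}^{\omega_{m-1}} \cdots \sum_{i_{t,0}=0}^{\omega_0} b_{i_{1,m-1} \cdots i_{t,0}} \, x_{1,m-1}^{i_{1,m-1}} \cdots x_{t,0}^{i_{t,0}}. \qquad (9)$$

The coefficient of $F_{E_m}$ is written as 

$$b_{i_{1,m-1} \cdots i_{t,0}} = a_{0\, i_{1,m-1} \cdots i_{t,0}} + a_{1\, i_{1,m-1} \cdots i_{t,0}} E_m + \cdots + a_{\omega_m\, i_{1,m-1} \cdots i_{t,0}} E_m^{\omega_m}. \qquad (10)$$

For simplicity, we consider the situation in which 
$\mathrm{E}_m^{1}, \ldots, \mathrm{E}_m^{\omega_m}$ (the attackers) try to 
calculate a coefficient of $F_{E_m^{\omega_m+1}}$ (the target), which is 
written by Eq. (10). 

Let $A_1, \ldots, A_{\omega_m}$ and $X$ be the random variables induced by 
$b^{(1)}_{i_{1,m-1} \cdots i_{t,0}}, \ldots, b^{(\omega_m)}_{i_{1,m-1} \cdots i_{t,0}}$ 
and $b^{(\omega_m+1)}_{i_{1,m-1} \cdots i_{t,0}}$, respectively. Let $S$ be 
the random variable induced by $a_{0\, i_{1,m-1} \cdots i_{t,0}}$. 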







D. Nojiri, G. Hanaoka, and H. Imai 



Even if $\mathrm{E}_m^{1}, \ldots, \mathrm{E}_m^{\omega_m}$ collaborate, one 
of the coefficients in Eq. (10) cannot be determined, because Eq. (10) is a 
polynomial of order $\omega_m + 1$ (degree $\omega_m$ in $E_m$, hence 
$\omega_m + 1$ unknown coefficients); that is, 

$$H(S \mid A_1 \cdots A_{\omega_m}) = H(S).$$

Therefore, 

$$H(X \mid A_1 \cdots A_{\omega_m}) = H(X), \qquad (11)$$

because one of the coefficients in Eq. (10) is not determined. 

From Eq. (11) we know the following: 

- A user’s secret is protected from collusion attacks by fewer than 
  $\omega_m$ users in the same domain. 

- Collusion attacks by users in different domains have no influence on a 
  user’s secret, because the secrets of users in different domains can be 
  replaced with the information of their parent TAs (see footnote 1), and 
  collusion attacks by parent TAs are impossible (see Remark 1). 

Therefore, each TA only has to watch for dishonest acts by its own child 
entities (Property (2)). It is also clear that even if a TA’s secret is 
leaked, only its child entities’ secrets are exposed to danger, and the 
secrets of entities in other hierarchies are kept safe (Property (3)). 
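As a worked illustration of the argument behind Eq. (11), consider the smallest case $\omega_m = 1$ with a single attacker. The notation $c(\cdot)$, $a_0$, $a_1$, $E^1$ and $E^2$ below is introduced only for this example, and the coefficients are assumed to be uniform over $GF(q)$ with $E^1 \neq 0$ and $E^1 \neq E^2$.

```latex
% Child coefficient as a function of the child ID (degree \omega_m = 1):
c(E) = a_0 + a_1 E .
% The single attacker with ID E^1 learns one value:
A_1 = c(E^1) = a_0 + a_1 E^1 .
% For every candidate value s of a_0 there is exactly one consistent a_1:
a_1 = (A_1 - s)\,(E^1)^{-1} ,
% so every s remains equally likely:  H(S \mid A_1) = H(S).
% The target coefficient is then
X = c(E^2) = s + (A_1 - s)\,E^2 (E^1)^{-1}
  = A_1 E^2 (E^1)^{-1} + s\,\bigl(1 - E^2 (E^1)^{-1}\bigr) ,
% which runs over all of GF(q) as s does, hence  H(X \mid A_1) = H(X).
```

With two attackers the two equations would determine $a_0$ and $a_1$, which is why the threshold is $\omega_m$ colluders and not more.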



4.3 Memory Size 

In this section, we evaluate the memory size required for HS-KPS by comparing 
it with conventional KPS. 

To accommodate the same number of users and keep the same security level, we 
set 

$$s \cdot \omega_{l-1} \cdot \omega_{l-2} \cdots \omega_0 = \omega, \qquad (12)$$

where $\omega$ is the collusion threshold of the conventional KPS and $s$ is 
a security parameter (see footnote 2) with $0 < s < 1$. Note that this 
equation is satisfied under the assumption that all bottom TAs have the same 
number of users. 

First, we estimate the memory size required for the root TA. From Eq. (4), 
the required memory size for the root TA is 

$$\binom{\omega + t}{t} \log |K| \qquad \text{(conventional KPS)} \qquad (13)$$

$$\binom{\omega_{l-1} + t}{t} \binom{\omega_{l-2} + t}{t} \cdots \binom{\omega_0 + t}{t} \log |K| \qquad \text{(HS-KPS)} \qquad (14)$$

Footnote 1. Knowing the secret of a parent entity is a sufficient condition 
for knowing the secrets of all its child entities. 

Footnote 2. We introduce this parameter because, even if the ratio of the 
collusion threshold to the number of users is the same in conventional KPS 
and HS-KPS, collaboration among 10,000 users seems more difficult than 
collaboration among 1,000 users (see Section 5, Example). 

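In the conventional KPS, by Eq. (12), 

$$\binom{\omega + t}{t} = (\omega + t)(\omega + t - 1) \cdots (\omega + 1)/t! = O(\omega_{l-1}^{t} \, \omega_{l-2}^{t} \cdots \omega_0^{t}). \qquad (15)$$

In HS-KPS, 

$$\binom{\omega_{l-1} + t}{t} \binom{\omega_{l-2} + t}{t} \cdots \binom{\omega_0 + t}{t} = (\omega_{l-1} + t)(\omega_{l-1} + t - 1) \cdots (\omega_{l-1} + 1)/t! \cdot (\omega_{l-2} + t) \cdots (\omega_{l-2} + 1)/t! \cdots (\omega_0 + t)(\omega_0 + t - 1) \cdots (\omega_0 + 1)/t! = O(\omega_{l-1}^{t} \, \omega_{l-2}^{t} \cdots \omega_0^{t}). \qquad (16)$$

From Eq. (15) and Eq. (16), we suggest that the memory size required for the 
root TA is approximately as large as that of the conventional KPS. 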



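Next, we estimate the memory size required for each user. From Eq. (4) and 
Eq. (6), the required memory size for each user is 

$$\binom{\omega + t - 1}{t - 1} \log |K| \qquad \text{(conventional KPS)} \qquad (17)$$

$$\binom{\omega_{l-1} + t - 1}{t - 1} \binom{\omega_{l-2} + t - 1}{t - 1} \cdots \binom{\omega_0 + t - 1}{t - 1} \log |K| \qquad \text{(HS-KPS)} \qquad (18)$$

In the conventional KPS, by Eq. (12), 

$$\binom{\omega + t - 1}{t - 1} = (\omega + t - 1)(\omega + t - 2) \cdots (\omega + 1)/(t - 1)! = O(\omega_{l-1}^{t-1} \, \omega_{l-2}^{t-1} \cdots \omega_0^{t-1}). \qquad (19)$$

In HS-KPS, 

$$\binom{\omega_{l-1} + t - 1}{t - 1} \binom{\omega_{l-2} + t - 1}{t - 1} \cdots \binom{\omega_0 + t - 1}{t - 1} = (\omega_{l-1} + t - 1) \cdots (\omega_{l-1} + 1)/(t - 1)! \cdot (\omega_{l-2} + t - 1) \cdots (\omega_{l-2} + 1)/(t - 1)! \cdots (\omega_0 + t - 1)(\omega_0 + t - 2) \cdots (\omega_0 + 1)/(t - 1)! = O(\omega_{l-1}^{t-1} \, \omega_{l-2}^{t-1} \cdots \omega_0^{t-1}). \qquad (20)$$

From Eq. (19) and Eq. (20), we suggest that the memory size required for each 
user is approximately as large as that of the conventional KPS. 

From the argument in this section we conclude that HS-KPS meets Property (4) 
and Property (5). 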

From the argument in this section we conclude that HS-KPS meets Pro- 
perty (4) and Property (5). 

5 Example 

In this section, we illustrate the advantages of our scheme with a rough 
example. 

Suppose a global communication system which consists of 100,000,000 users 
spread over 10 countries. If this system is implemented by a conventional 
KPS, we have the following problems: 1) the TA has to give secrets to 
100,000,000 users in 10 countries in a secure manner, and 2) the TA has to 
watch for collusion attacks among any users in 10 countries. These tasks are 
considerably heavy because of the geographical conditions. A straightforward 
modification that solves these problems is to make 10 copies of the TA so 
that every country has at least one TA. However, with this modification the 
whole system can be broken by the leakage of the secret of a single TA. 

Using our scheme, all of the problems above are solved. That is, by setting 
up a root TA and 10 child TAs, each child TA only has to communicate with 
domestic users and watch for collusion attacks among them. Moreover, even if 
the secret of a child TA is leaked, only its domestic users are exposed to 
danger, and communications among users in other countries are kept safe. We 
now compute the required memory size in the straightforward modification and 
in our scheme. In this example, the hierarchy level is $l = 2$, the number of 
users sharing a key is $t = 2$, the thresholds for collusion attacks are 
$\omega_1 = 10$, $\omega_0 = 1250$ and $\omega = 10,000$ (see Eq. (12)), and 
the key length is 128 bits. According to Eq. (13) and Eq. (14), the memory 
size of the TA in the straightforward modification is 800,240,016 bytes and 
that of the root TA in our scheme is 826,981,056 bytes. They are 
approximately the same size. In our scheme, however, the memory size of a 
second-level TA is more important, because it plays the same role as the TA 
in the straightforward modification. Its memory size is 137,830,176 bytes, 
about a sixth of the memory size of the TA in the straightforward 
modification. From Eq. (17) and Eq. (18), the memory size of a user in the 
straightforward modification and in our scheme is 160,000 bytes and 220,000 
bytes, respectively. These are approximately the same size. Cases similar to 
this example are often found in practical networks, and our system can be 
applied to them as well. 
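The figures in this example can be reproduced with a short calculation. The sketch below follows the coefficient counts of Eqs. (13), (14), (17) and (18). The count used for a second-level TA is not given as a formula in the text; it is an inference (one variable of the top level already substituted) that happens to reproduce the stated 137,830,176 bytes. All function names are hypothetical.

```python
from math import comb, prod

KEY_BYTES = 16  # 128-bit keys

def root_conventional(omega, t):
    return comb(omega + t, t)                              # Eq. (13)

def root_hs(omegas, t):
    return prod(comb(w + t, t) for w in omegas)            # Eq. (14)

def user_conventional(omega, t):
    return comb(omega + t - 1, t - 1)                      # Eq. (17)

def user_hs(omegas, t):
    return prod(comb(w + t - 1, t - 1) for w in omegas)    # Eq. (18)

def child_ta_hs(omegas, t):
    """Inferred count for a TA one level below the root (not stated as a formula)."""
    top, rest = omegas[0], omegas[1:]
    return comb(top + t - 1, t - 1) * prod(comb(w + t, t) for w in rest)

omegas = [10, 1250]   # [omega_1, omega_0]
omega, t = 10_000, 2

s = omega / prod(omegas)                                   # Eq. (12)
sizes = {
    "TA, straightforward": root_conventional(omega, t) * KEY_BYTES,
    "root TA, HS-KPS": root_hs(omegas, t) * KEY_BYTES,
    "second-level TA, HS-KPS": child_ta_hs(omegas, t) * KEY_BYTES,
    "user, conventional": user_conventional(omega, t) * KEY_BYTES,
    "user, HS-KPS": user_hs(omegas, t) * KEY_BYTES,
}
for name, size in sizes.items():
    print(f"{name}: {size:,} bytes")
```

The exact user figures are 160,016 and 220,176 bytes, which the text rounds to 160,000 and 220,000; the security parameter implied by Eq. (12) for these thresholds is $s = 0.8$.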
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6 Conclusion 

In this paper, we pointed out that the TA’s task is considerably heavy when 
KPS and OTBES are implemented with conventional schemes. In order to solve 
this problem, we proposed an efficient method for setting up multiple TAs 
hierarchically. With our scheme, each TA only has to distribute secrets to 
its child entities and watch for dishonest acts among them. Furthermore, even 
if a TA’s secret is leaked, only its child entities’ secrets are exposed to 
danger. Any other entities’ secrets are still kept secret. Our scheme is also 
efficient in terms of memory size: the memory size required for each TA is 
small, and the memory size required for a user is very close to the lower 
bound in conventional KPS and OTBES. With our scheme, large-scale 
communication systems, pay TV and other applications can be implemented very 
efficiently and securely. 
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Abstract. Certified electronic mail is a kind of fair exchange of values: a 
message for a receipt. An exchange is fair if at the end of the exchange, either 
each party receives the item it expects or neither party receives any useful 
information about the other's item. Fairness can be achieved through the 
involvement of a trusted third party (TTP). The optimistic approach, which 
involves the third party only in the case of exceptions (one party cannot 
obtain the expected item from the other party), is both interesting and 
practical. Previous 
solutions using this approach implicitly assumed that players had reliable 
communication channels to the third party [2]. In this paper, we present an 
efficient (only three steps, the minimum), optimistic and fair protocol for 
certified electronic mail. 



1 Introduction 

Electronic mail, such as X.400 [24] or Internet mail [6, 20], is one of the 
most important electronic services over public and private networks. It is 
used by governments, private companies, banks, etc., and by many individuals. 
It is clear that electronic mail saves time, money, etc., but some flaws have 
been pointed out: one of them is the lack of security. Some security problems 
related to electronic mail have been addressed: confidentiality, integrity 
and authenticity. Several proposals exist: S/MIME, PEM, PGP, etc. 

But if we want total electronic integration, we have to carry over all the 
services offered by postal companies to their corresponding electronic 
versions. One of these services is certified mail. As in the paper version, 
certified electronic mail is a service offered to users who want to obtain a 
receipt (bound to the message) from the recipient. 

We handle certified electronic mail as a fair exchange of values: the 
originator has an item (a message, and possibly a non-repudiation of origin 
token) to be exchanged for a recipient’s item (the receipt, a non-repudiation 
of receipt token). This is not the only possibility; see [26], which tries to 
follow the postal service literally. The exchange has to be fair in the sense 
that nobody wants to send its item without a guarantee of receiving the 
expected item. We find different definitions of fair exchange in previous 
work: 

• an exchange is fair if at the end of the exchange, either each player 
receives the item it expects or neither player receives any additional 
information about the other’s item [2]; 
• a non-repudiation protocol is fair if it provides the originator and the 
recipient with valid irrefutable evidence after completion of the protocol, 
without giving a party an advantage over the other at any stage of the 
protocol run [25]. 

These two definitions (and most of the definitions found in the literature) 
are in fact analogous. 

Early solutions fall into one of the following two categories: gradual release of 
secrets (parties exchange the message and non-repudiation tokens “simultaneously”) 
and third party protocols (parties exchange the message and non-repudiation tokens, 
assisted by a TTP). 

The first approach [9, 19] achieves fairness by the gradual release of 
information over many rounds: some knowledge about the message and the 
non-repudiation evidence is revealed during each round. At each round both 
parties are left with comparable knowledge about the other party’s item [27]. 
So, if either party stops before the protocol run is completed, both parties 
are in a similar situation. This approach seems too cumbersome and 
inefficient for actual implementation, due to the high communication 
overhead. Moreover, fairness is based on the assumption of equal 
computational power. This assumption is unrealistic in practice and 
undesirable from a theoretical point of view [3]. 

Several fair exchange protocols make use of a trusted third party [1, 
2, 4, 5, 7, 8, 10, 11, 12, 13, 17, 25, 27, 28]. They differ in the degree of 
the TTP’s involvement in a protocol run. From this viewpoint, we classify the 
protocols into two classes: those with active TTPs (a TTP is actively 
involved in every protocol run) and those with subsidiary TTPs, or optimistic 
protocols [1] (a TTP only intervenes if an exception occurs, and so not in 
every protocol run). 

The solutions with a TTP also have drawbacks: the trusted third party could 
become a bottleneck. Hence, one of the goals in designing an efficient 
certified electronic mail protocol is to reduce the TTP’s intervention. 
Especially interesting are those protocols that only need the TTP’s 
involvement in case of an exception. The parties exchange their items 
following the sequence of steps specified in the protocol. They hope to 
receive the expected item from the other party, and this will be the case if 
the protocol ends successfully. Otherwise, if one party is trying to cheat 
(or there are communication faults), the other party may contact the TTP to 
resolve the unfair situation. Of course, a cheating party can also contact 
the TTP, so good protocols have to foresee this possibility. 

One problem arises in this kind of solution: when do the parties have to 
contact the TTP? This problem could lead to a loss of fairness in the 
exchange. So, an interesting property for protocols is to be asynchronous: 
both parties can contact the TTP whenever they want, without losing fairness. 

Related to the timing problem is the problem of the reliability of the 
communication channels. Proposed protocols for certified electronic mail with 
a TTP differ in their assumptions on the availability of communication 
channels. For example, [25, 27] present protocols that require the channels 
to the TTP to be available to the parties before a specific time. The 
recipient must not be able to repudiate receipt of a message (if it was 
received) by falsely claiming the failure (communication problems, 
attackers’ actions, etc.) of the communication channel [27]. 

Asokan et al. [2] define three kinds of channels: operational, reliable and 
resilient. We will make the weaker assumption that the channel between any 
party and the TTP is resilient. Over a resilient channel we can make no 
timing assumptions (see previous paragraphs). Our protocol is designed so 
that parties will never be able to claim a loss of fairness because a channel 
is only resilient. 

Protocols are a sequence of steps followed by the parties involved in the 
exchange. It is very important that the number of steps be as small as 
possible [29]. We must bear in mind that every message exchanged by means of 
an off-line communication protocol (like e-mail) increases the total delay 
time. 

Of course, fairness is not the only property that a protocol for certified 
electronic mail must possess. For example, even if an exchange is totally 
fair, the originator and the recipient may afterwards disagree about what has 
been exchanged. Parties have to accumulate sufficient evidence to be used in 
case of dispute resolution. 

In this paper we present a protocol for certified electronic mail with the 
following characteristics: 

• fair: no party has an advantage at any point of a protocol execution; 

• optimistic: the TTP only intervenes in case of exception (the optimistic 
approach [1]: one party cannot obtain the expected item from the other 
party); 

• asynchronous: at any stage of a protocol execution, either player can 
unilaterally choose to force an end to the protocol without losing fairness; 

• efficient: only three messages have to be exchanged between parties (the 
minimum). 

In section 2 we review two previous protocols. In section 3 we outline our 
approach by describing the protocol for certified electronic mail, and 
discuss dispute resolution. In section 4 we present an informal security 
analysis. We end the paper with brief conclusions and future work. 



2 Previous Work: ASW and ZDB 

Some requirements for fair exchange were stated in [2], and re-formulated in [28]: 

1. Effectiveness. If two parties behave correctly, they will receive the expected items 
without any involvement of the TTP. 

2. Fairness. After completion of a protocol run, either each party receives the 
expected item or neither party receives any useful information about the other’s 
item. 

3. Timeliness. At any time during a protocol run, each party can unilaterally choose to 
terminate the protocol without losing fairness. 

4. Non-repudiation. If an item has been sent from party A to party B, A cannot deny 
origin of the item and B cannot deny receipt of the item (see [14, 15, 16]). 

5. Verifiability of Third Party. If the third party misbehaves, resulting in the loss of 
fairness for a party, the victim can prove the fact in a dispute. 

We would like to add one more desirable property: efficiency. It is important that 
the number of steps in any protocol be as small as possible [29], especially if we 
consider that exchanges are made using an off-line communication system (e-mail), and 
every step increases the total delay. It is also desirable that the number of 
cryptographic operations be as small as possible. 

Finally, we consider privacy to be optional for the user, but compulsory for the 
proposed protocols to support. If users involved in an exchange want 
confidentiality, it has to be possible to carry out a confidential exchange, even with 
respect to the TTP (in case it has to intervene). 

The ASW [2] and ZDB [28] protocols both have three sub-protocols: exchange, abort 
and resolve. In the normal case, only the exchange sub-protocol is executed, with four 
steps in both protocols. We agree with [28] that the ASW protocol has some problems 
(the abort sub-protocol is flawed, etc.). As regards the ZDB protocol, we have only 
some minor criticisms: 

• In the resolve sub-protocol the Originator or the Recipient sends the EOR_C token to 
the TTP, and then this same token is returned by the TTP; this return can 
be eliminated. 

• The protocol has four steps in the exchange sub-protocol, and in this paper we 
show that the exchange can be done in three steps. 

• It needs an abort sub-protocol for O, and a resolve sub-protocol for O and R, 
so the code to implement it has to be longer than for the protocol introduced 
in this paper. 

• It claims to be verifiable in relation to the TTP, but this rests on the assumption 
that the TTP can be forced to respond. The definition of a resilient channel in [2] 
establishes that a message inserted in this kind of channel will be delivered, but 
it does not compel the TTP to send any message. Of course, 
it is easy to prove before an arbiter that the TTP is not acting properly (it is not 
sending the expected message). 



3 An Efficient Protocol 

We begin by describing the proposed protocol. Then we discuss the resolution of 
possible disputes. 



3.1 Protocol 

The originator, A(lice), and the recipient, B(ob), will exchange messages and 
non-repudiation evidence directly. Only as a last resort, in case they cannot get the 
expected items from the other party, will the TTP (T) be invoked, by initiating the 
cancel or finish sub-protocols. 

Our protocol is especially suited to be used with RSA [23], but it is easy to extend 
the proposed protocol to an arbitrary asymmetric cryptographic scheme. In the 
following description, we have not included elements to link messages of an 
exchange, nor operations to achieve confidentiality, in order to simplify the 
explanation. The notation and elements used in the protocol description are as 
follows: 

Table 1. Notation and elements 

Notation 

X, Y                                 concatenation of two messages X and Y 
H(X)                                 a collision-resistant one-way hash function of 
                                     message X 
PR_i[H(X)]                           digital signature on message X with the private 
                                     key, or signing key, of principal i (using some 
                                     hash function, H(), to create a digest of X) 
i → j: X                             principal i sends message (or token) X to 
                                     principal j 

Elements 

M                                    message to be sent certified from A(lice) to 
                                     B(ob) 
c = E_K(M)                           symmetric encryption of message M with key 
                                     K, producing the ciphertext c 
k_T = PU_T(K)                        key K enciphered with the public key of the 
                                     TTP (it means that only the TTP, who knows 
                                     the corresponding private key, can decrypt it) 
h_A = PR_A{H[H(c), k_T]}             signature of A on the concatenation of the hash 
                                     of ciphertext c and k_T; part of the evidence of 
                                     NRO (non-repudiation of origin) for B 
h_B = PR_B{H[H(c), k_T]}             signature of B on the concatenation of the hash 
                                     of ciphertext c and k_T; evidence of NRR 
                                     (non-repudiation of receipt) for A 
k_A = PR_A("key=", K)                key K enciphered with the private key of A 
                                     (and so, signed); second part of NRO evidence 
                                     for B 
k_T' = PR_T("key=", K)               key K enciphered with the private key of the 
                                     TTP (and so, signed); alternative second part of 
                                     NRO evidence for B 
h_AT = PR_A{H[H(c), k_T, h_A]}       this token is evidence that A has demanded 
                                     the TTP's intervention 
h_BT = PR_B{H[H(c), k_T, h_A, h_B]}  this token is evidence that B has demanded 
                                     the TTP's intervention 
h_B' = PR_T[H(h_B)]                  signature of the TTP on h_B to prove its 
                                     intervention 


The exchange sub-protocol is as follows: 

1. A → B: c, k_T, h_A 

2. B → A: h_B 

3. A → B: k_A 

If the protocol run is completed, the originator A will hold non-repudiation of 
receipt evidence, h_B, and the recipient B will hold the message M. B can decrypt k_A 
with the public key of A, obtaining the key K, and then he can decrypt the ciphertext c 
using that key K. If this is not the case, A or B, or both, need to rectify the unfair 
situation by initiating the cancel or finish sub-protocol, respectively, so that the 
situation returns to a fair position. 
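
The three messages of the exchange sub-protocol can be sketched in Python as follows. This is only an illustration under loudly stated assumptions, not the authors' implementation: it uses textbook RSA with tiny hard-coded Mersenne primes (insecure), SHA-256 as a stand-in for H, a hash-based XOR keystream as a stand-in for the symmetric cipher E_K, and a 4-byte key K. The helper names (make_keys, rsa, sign, verify, E_K) are hypothetical and do not come from the paper. The token names follow the notation of Table 1: k_T is K under the TTP's public key, h_A and h_B are the signatures of A and B on H(c), k_T, and k_A is "key=", K under A's private key.

```python
import hashlib

E = 65537  # public exponent shared by all toy keys (assumption)

def H(*parts: bytes) -> bytes:
    """H(X, Y, ...): SHA-256 of the concatenated parts (stand-in for H)."""
    return hashlib.sha256(b"".join(parts)).digest()

def make_keys(p: int, q: int):
    """Toy textbook-RSA key pair from two primes; returns (public, private)."""
    n, phi = p * q, (p - 1) * (q - 1)
    return (n, E), (n, pow(E, -1, phi))

def rsa(key, value: int) -> int:
    """Apply a public or private key: value^exponent mod n."""
    n, exponent = key
    return pow(value, exponent, n)

def to_int(data: bytes) -> int:
    return int.from_bytes(data, "big")

def to_bytes(value: int) -> bytes:
    return value.to_bytes((value.bit_length() + 7) // 8, "big")

def sign(private, *parts: bytes) -> bytes:
    """PR_i{H[...]}: private-key operation on the digest reduced mod n."""
    return to_bytes(rsa(private, to_int(H(*parts)) % private[0]))

def verify(public, signature: bytes, *parts: bytes) -> bool:
    return rsa(public, to_int(signature)) == to_int(H(*parts)) % public[0]

def E_K(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256 keystream (its own inverse)."""
    stream = b""
    block = 0
    while len(stream) < len(data):
        stream += H(key, block.to_bytes(4, "big"))
        block += 1
    return bytes(d ^ s for d, s in zip(data, stream))

# Mersenne primes keep the toy keys easy to write down; far too small for real use.
pub_A, priv_A = make_keys(2**31 - 1, 2**61 - 1)
pub_B, priv_B = make_keys(2**61 - 1, 2**89 - 1)
pub_T, priv_T = make_keys(2**89 - 1, 2**107 - 1)

M = b"certified message"
K = b"\x13\x37\xbe\xef"                      # 4-byte session key (toy size)

# Message 1, A -> B: c, k_T, h_A
c = E_K(K, M)
k_T = to_bytes(rsa(pub_T, to_int(K)))        # only the TTP can recover K from k_T
h_A = sign(priv_A, H(c), k_T)

# Message 2, B -> A: h_B (B first checks A's signature)
assert verify(pub_A, h_A, H(c), k_T)
h_B = sign(priv_B, H(c), k_T)

# Message 3, A -> B: k_A (A first checks B's signature)
assert verify(pub_B, h_B, H(c), k_T)
k_A = to_bytes(rsa(priv_A, to_int(b"key=" + K)))

# B recovers K from k_A with A's public key, then decrypts c.
recovered = to_bytes(rsa(pub_A, to_int(k_A)))
assert recovered == b"key=" + K
assert E_K(recovered[4:], c) == M
```

In this sketch B learns nothing about M until message 3 arrives, while A already holds h_B after message 2; this is exactly the asymmetry that the finish sub-protocol is meant to repair.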

If A “says” (A could be trying to cheat, or could have a mistaken view of the 
exchange state) that she has not received message 2 from B, A may initiate the 
following cancel sub-protocol: 

1'. A → T: H(c), k_T, h_A, h_AT 

IF (finished = true)  2'. T: retrieves h_B 
                          T → A: h_B, h_B' 

ELSE                  2'. T → A: PR_T[H(“cancelled”, h_A)] 
                          T: stores cancelled = true 

The TTP will verify the correctness of the information given by A. If it is not correct, 
the TTP will send an error message to A. Otherwise, it will proceed in one of two 
possible ways. If the variable finished is true, it means that B had previously 
contacted the TTP (see paragraph below). The TTP had given the key K 
(decrypting k_T with its private key) to B, and now it has to give the NRR token to A. 
So, it retrieves this stored NRR token, h_B, and sends it to A, together with a token to 
prove its intervention, h_B'. If B had not contacted the TTP previously, the TTP will send a 
message to A to cancel the transaction, and it will store this information (cancelled = 
true) in order to satisfy future requests from B. In either case, we are now again in a 
fair situation. 

If B “says” that he has not received message 3, B may initiate the following finish 
sub-protocol: 



2'. B → T: H(c), k_T, h_A, h_B, h_BT 

IF (cancelled = true)  3'. T → B: PR_T[H(“cancelled”, h_B)] 

ELSE                   3'. T → B: k_T' 
                           T: stores finished = true, and h_B 

The TTP will also verify the correctness of the information given by B. If it is not 
correct, the TTP will send an error message to B. Otherwise, it will proceed in one of 
two possible ways. If the variable cancelled is true, it means that A had previously 
contacted the TTP (see paragraph above). The TTP had given a message to A to 
cancel the transaction, and now it has to send a similar message to B. If A had not 
contacted the TTP previously, the TTP will send the key K (obtained by the 
decryption of k_T with its private key) re-encrypted with its private key, k_T', so B will be 
able to decrypt the ciphertext c. In this case the TTP will store the NRR token, h_B, and 
will assign the value true to the finished variable, in order to satisfy future requests 
from A. Again, in either case, we are now in a fair situation. 
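
The way the two flags make the cancel and finish sub-protocols mutually exclusive can be sketched as a small state machine. This is a minimal sketch, assuming one TTP object per exchange and with signature checking abstracted into a boolean; the class name TTP, the request_valid parameter and the symbolic string tokens are illustrative and not part of the paper.

```python
from typing import Optional, Tuple


class TTP:
    """State kept by the TTP for a single exchange (illustrative only)."""

    def __init__(self) -> None:
        self.cancelled = False
        self.finished = False
        self.h_B: Optional[str] = None   # NRR token stored when B finishes

    def cancel(self, request_valid: bool) -> Tuple[str, ...]:
        """Cancel sub-protocol, requested by A."""
        if not request_valid:
            return ("error",)
        if self.finished:
            # B already obtained K: A gets the stored NRR token and the
            # TTP's proof of intervention.
            return (self.h_B, "h_B'")
        self.cancelled = True
        return ("cancelled",)

    def finish(self, request_valid: bool, h_B: str) -> Tuple[str, ...]:
        """Finish sub-protocol, requested by B."""
        if not request_valid:
            return ("error",)
        if self.cancelled:
            # A already cancelled: B gets a cancel token instead of the key.
            return ("cancelled",)
        self.finished = True
        self.h_B = h_B
        return ("k_T'",)


# B finishes first, then A tries to cancel: A still receives her receipt.
ttp = TTP()
first = ttp.finish(True, "h_B")
second = ttp.cancel(True)

# A cancels first, then B tries to finish: B receives only the cancel token.
other = TTP()
third = other.cancel(True)
fourth = other.finish(True, "h_B")
```

Whichever request arrives first fixes the outcome for both parties, which is why neither order of arrival leaves one side with an advantage.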



3.2 Dispute Resolution 

After a protocol run is completed (with or without the participation of the TTP), 
disputes can arise between participants. We can face two possible types of 
disputes: 

• repudiation of origin: B claims to have received M from A, and A denies having 
sent M to B; and 

• repudiation of receipt: A claims to have sent M to B, and B denies having received 
M. 

An external arbiter (not part of the protocol) has to evaluate the evidence held and 
brought by the parties, to resolve these two types of disputes. As a result of this 
evaluation, the arbiter will determine which party is in the right. 

In the case of repudiation of origin, B is claiming that he received the message M 
from A. He has to provide the following information to an arbiter: M, c, k_T, h_A, and 
k_A or k_T'. The arbiter will check if h_A is A’s signature on (H(c), k_T), and if it is 
the arbiter will assume that A had sent c to B. Then, the arbiter will check if k_A is A’s 
signature on K, or will check if k_T' is the TTP’s signature on K. If this check is positive, 
the arbiter will assume that either A or the TTP had sent the key K to B. To end, the 
arbiter will check if the decryption of c, D_K(c), is equal to M. If this final check is 
positive, the arbiter will side with B. Otherwise, if one or more of the previous checks 
fail, the arbiter will reject B’s demand. If the evidence held by B proves he is right, 
and A holds a message like PR_T[H(“cancelled”, h_A)], it means that the TTP had acted 
improperly (see section 4). 

In the case of repudiation of receipt (repudiation by B), A is claiming that B had 
received M. She has to provide the following information to an arbiter: M, c, k_T, h_B and 
K. The arbiter will check if h_B is B’s signature on (H(c), k_T), and if it is the 
arbiter will assume that B had received c and k_T, and that he is committed to obtaining the 
key K (and therefore the message M). Then, the arbiter will check if k_T is the 
encryption of K under the public key of the TTP. If this check is positive, the arbiter 
will assume that B could obtain the key K from the TTP (if he had not received it from 
A). If the previous checks fail, the arbiter will reject A’s demand. If the checks are 
positive, will the arbiter side with A? Not yet. First the 
arbiter should question B. If B produces a message like PR_T[H(“cancelled”, h_B)], 
it means that B had contacted the TTP, and the TTP observed that A had already 
executed the cancel sub-protocol. For this reason the TTP sent the cancel message to 
B. It is then demonstrated that A has tried to cheat. Therefore, the arbiter will reject 
A’s demand, and the arbiter will side with B. If B cannot produce the cancel 
message, the arbiter will check if the decryption of c, D_K(c), is equal to M. If this final 
check is positive, the arbiter will side with A. Otherwise, if this check fails, 
the arbiter will reject A’s demand. If the evidence held by A proves she is right, and B 
holds a message like PR_T[H(“cancelled”, h_B)], it means that the TTP had acted 
improperly (see section 4). 
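
The arbiter's two decision procedures can be summarised as follows. This is a sketch only: each cryptographic check is assumed to have been carried out already and is passed in as a boolean, and the function names, parameter names and verdict strings are illustrative rather than taken from the paper.

```python
def arbitrate_origin(h_A_valid: bool, key_token_valid: bool,
                     decrypts_to_M: bool) -> str:
    """Repudiation of origin: B claims that A sent M.

    h_A_valid        A's signature on (H(c), k_T) verifies
    key_token_valid  k_A is A's signature on K, or k_T' is the TTP's
    decrypts_to_M    decrypting c with K yields M
    """
    if h_A_valid and key_token_valid and decrypts_to_M:
        return "side with B"
    return "reject B's demand"


def arbitrate_receipt(h_B_valid: bool, k_T_encrypts_K: bool,
                      B_holds_cancel: bool, decrypts_to_M: bool) -> str:
    """Repudiation of receipt: A claims that B received M.

    h_B_valid        B's signature on (H(c), k_T) verifies
    k_T_encrypts_K   k_T is K under the TTP's public key
    B_holds_cancel   B can show the TTP's cancel token
    decrypts_to_M    decrypting c with K yields M
    """
    if not (h_B_valid and k_T_encrypts_K):
        return "reject A's demand"
    if B_holds_cancel:
        # A cancelled through the TTP and still claims delivery.
        return "side with B"
    if decrypts_to_M:
        return "side with A"
    return "reject A's demand"


verdict_origin = arbitrate_origin(True, True, True)
verdict_receipt = arbitrate_receipt(True, True, False, True)
verdict_cheat = arbitrate_receipt(True, True, True, True)
```

The receipt case needs one more input than the origin case because B's cancel token overrides otherwise valid evidence from A.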



4 Security Analysis 

Next, we give an informal security analysis of the proposed protocol, checking the 
requirements listed in Section 2. We assume that the communication channels are 
resilient. 



Claim 1. The protocol is effective. 

Proof: If A and B follow the steps listed in the exchange sub-protocol, and the 
communication channel between them is resilient, they will receive the expected 
items without the TTP’s intervention. A will receive NRR evidence (h_B), and B will 
receive c and K, and thus M = D_K(c), and NRO evidence (h_A, k_A). 

Claim 2. The protocol is fair. 

Proof: A may face the following unfair situation: she did not receive any message 
from B after sending message 1 in the exchange sub-protocol. A can initiate the cancel 
sub-protocol, which is guaranteed to be completed within a finite period if the channel 
with the TTP is resilient. If B has already finished the protocol, A will obtain h_B and 
h_B' from the TTP, which can be used to prove that B received M. If B has not finished 
the protocol, the TTP will not finish the protocol at a later time, thus B cannot obtain 
K, and thus M. 

B may face the following unfair situation: he did not receive k_A, and so K, after 
sending message 2 in the exchange sub-protocol. B may initiate the finish 
sub-protocol (which is guaranteed to be completed within a finite period under the same 
assumption) to obtain K from the TTP. If A has not cancelled the protocol, the TTP 
will send k_T' to B, and it will not cancel the protocol at a later time. Thus, A cannot 
obtain a cancel message from the TTP, and B has h_A and k_T' to prove he received M 
from A. If A has already cancelled the protocol, B will obtain a cancel message from 
the TTP, which can be used to prove that B did not receive M. 

So, the protocol satisfies the fairness requirement for A and B. At no point of a 
protocol run does either participant have an advantage. Any party can bring the 
protocol to a symmetric situation by invoking the TTP. 

As a consequence of the two previous claims, the protocol will be very efficient 
in an environment where the two parties usually play fair. In fact, knowing that the TTP 
will always bring the situation to a symmetric (and so, fair) point, we expect that 
almost all participants will act properly, and that an intervention of the TTP will be 
necessary only in rare exceptions. 

Claim 3. The protocol meets the timeliness requirement. 

Proof: A can conclude a protocol run after sending message 3 in the exchange 
sub-protocol, or by invoking the cancel sub-protocol at any time before sending message 3 in 
the exchange sub-protocol. The cancel sub-protocol initiated by A is guaranteed to be 
completed within a finite period (the communication channel between the TTP and A 
is resilient). B can conclude a protocol run after receiving message 3 in the exchange 
sub-protocol, or by invoking the finish sub-protocol at any time after receiving message 1 
in the exchange sub-protocol. Thus, at any time there is always a way for A and B to 
conclude the protocol run fairly. 

Therefore, it is not necessary to specify a deadline to finish the protocol. When A 
has sent message 1, she has to wait for message 2 from B, but whenever she wants 
she can contact the TTP to cancel the execution (in case she doesn’t receive 
message 2, or even if she has received it but she wants to cheat). The same can be said 
about B, if he doesn’t receive message 3. So we can affirm that our protocol is totally 
asynchronous. Being independent of any time parameter is a great advantage with 
respect to other solutions proposed in the literature. 

Parties can agree on a reference time, t, by which they want to have finished the 
exchange. It could be desirable for A and B to know, as a temporal reference, when they 
have to contact the TTP. This agreed time will not affect the protocol’s fairness or the 
TTP's involvement described in section 3.1. It is only a reference time that will help 
them to solve unfair situations. Thus, the protocol provides fairness both when a deadline 
for a protocol run exists and when there is no deadline. 

Claim 4. The protocol satisfies the non-repudiation requirement. 

Proof: If the protocol ends normally, B will hold the following NRO evidence: (h_A, 
k_A), or (h_A, k_T') in case of the TTP's intervention. A will hold the following NRR 
evidence: h_B, and h_B' if the TTP has intervened. 

The token h_A proves that A sent c to B, while k_A proves that A sent K to B. Thus (h_A, 
k_A) proves that M = D_K(c) is from A. Alternatively, k_T' proves that the TTP notarized K, 
and (h_A, k_T') also proves that M = D_K(c) is from A. The token h_B proves that B received 
c, and so he is compelled to finish the protocol (normally or with the finish 
sub-protocol). The token h_B' proves that B received k_T' from the TTP. 

Claim 5. The protocol satisfies the verifiability requirement, if the TTP can be forced 
to send a valid response to any request sent to it. 

Proof: One possible misbehavior of the TTP is: A receives h_B and h_B', while B receives the 
cancel token. If A uses h_B and h_B' to prove that B received M, B can use the cancel 
token to prove the TTP's misbehavior. If A received h_B and then cancelled the 
exchange (and so she did not receive h_B'), and tries to use h_B to prove that B received M, 
and the TTP has the h_AT token, A's misbehavior is clear. 

The other possible misbehavior of the TTP is: B receives k_T' while A receives the cancel 
token. If B uses h_A and k_T' to prove that A sent M to B, A can use the cancel token to 
prove the TTP's misbehavior. It should be noted that if B uses h_A and k_A to prove that A 
sent M to B, A cannot use the cancel token to prove the TTP's misbehavior, since the 
TTP did not issue conflicting evidence (clearly A is misbehaving). 

Claim 6. The protocol meets the efficiency requirement, needing the minimum 
number of steps (three) for an optimistic fair protocol. 

Proof: Clearly, a fair exchange cannot be achieved with a two-step protocol. If A 
sends her item to B, B possesses the item from A and can now decide not to 
send his item to A. 

Claim 7. It is possible, and very easy, to conduct an exchange between A and B that is 
confidential even with respect to the TTP (in case it has to intervene). 

Proof: A and B can exchange a secret key of a symmetric cryptosystem, k, using some 
key-exchange protocol (see [18, 21]; for instance, encrypting k with the public key of 
B). Then, they can re-encrypt the ciphertext c of the exchange sub-protocol with that 
key k, known only to them. Now we have to analyze the confidentiality of the 
exchange for two possible situations: 

• the TTP has not intervened in the exchange: observe that only B can decrypt the 
re-encryption made with key k. 

• the TTP has intervened in the exchange: observe that the TTP only needs a hash of 
c to verify the correctness of the information given by A or B, and so the TTP 
cannot obtain the message M, even if it intercepted the re-encrypted message (with K 
and k) in the communication channel between A and B. 



5 Conclusion and Future Work 

We have presented a fair protocol for certified electronic mail. Fairness is 
guaranteed by the existence (and possible involvement) of a trusted third party 
that plays a subsidiary role (it only intervenes in case of exception). Compared with 
other solutions, our protocol is doubly efficient: we have reduced the involvement of 
the TTP and the number of steps in the protocol. The only assumption made with respect 
to the communication channels is the following: the communication channels between 
parties and the TTP are not permanently broken. Finally, the described protocol does 
not require any deadline to guarantee fairness: each party can contact the TTP when it 
wants. 

Our protocol (like [28]) is vulnerable to coalitions of a party with the TTP. With the 
information contained in the first message of the exchange sub-protocol, the TTP can 
give the key K to B, and then, if A contacts it, the TTP can send a cancel message 
to A. So, B can read the message and A has no receipt. Of course B has no 
non-repudiation of origin (NRO) token from A, but in many situations (in certified 
electronic mail) an NRO token is less important for B than being able to read the message. 
In [22] we find a solution with a set of TTPs that is resistant to a subset of malicious 
third parties. 



Abstract. In recent years, protocols have been developed to ensure secure 
communications over the Internet, e.g., the secure sockets layer 
(SSL) and secure electronic transaction (SET). Deployment of these protocols 
incurs additional resource requirements at the client and server. 
This may have a negative impact on system performance. In this paper, 
we consider a scenario where users request information pages stored 
on a web server, and some of the requests require secure communication. 
An analytic model is developed to study the performance of a web 
server based on SSL. In our model, the details of the client-server interactions 
found in a typical SSL session are represented explicitly. Input 
parameters to this model are obtained by measuring an existing SSL 
implementation. Numerical examples on the performance characteristics of 
SSL are presented. 



1 Introduction 

In recent years, we have witnessed a general acceptance of the Internet by 
businesses and consumers. A key requirement for the success of Internet business is 
secure communication. It is generally known that messages sent on the Internet 
are subject to three types of security threats, namely eavesdropping, modification, 
and impersonation [1]. Trusted security mechanisms and protocols have been 
developed to ensure secure communications over the Internet. An important 
security protocol is the secure sockets layer (SSL) [2,3]. It has been implemented 
in all the major web browsers and in web servers like Apache [4], Lotus Domino 
server [5] and IBM HTTP server [6]. Another important security protocol is 
secure electronic transaction (SET) [7]. 

Deployment of security protocols, such as SSL and SET, incurs additional 
resource requirements at the client and server. This may have a negative impact 
on system performance, e.g., increased response time. Performance evaluation 
of SSL, SET or other security mechanisms has not received much attention 
until recently. In [8], the performance improvement of SSL when caching of 
session keys is used is evaluated. In another study [9], performance results for a 
transport layer security protocol are presented. That study was based on the use 
of a benchmark tool, SPECweb96 [10]. In addition, a performance comparison of 
SSL and SET is reported in [11]. 

Our work is different from those in [8,9,11] in the sense that analytic 
modeling is used to study the performance of SSL. More specifically, we consider 
a scenario where users request information pages stored on a web server, and 
some of the requests require secure communication. A model is developed to 
represent the details of client-server interactions during a typical SSL session. As 
part of our investigation, we obtain input parameters to our model by 
measuring an existing SSL implementation. Analytic results are then obtained; these 
results are used to evaluate the performance penalty incurred by SSL. The issue 
of scalability is also investigated. The advantages of our model include (i) the 
effects of implementation are represented by the actual measurements and thus 
the model does not have to consider implementation details, and (ii) the mean 
response time results are valid for arbitrary distribution patterns of user think 
time and the various service time parameters. 

The rest of this paper is organized as follows. In section 2, the details of 
a typical SSL session are described with a view to capturing the client-server 
interactions. Our model is described in section 3. Exact analytic results for the 
mean response time are derived. System measurement techniques to obtain input 
parameters are also described. In section 4, numerical examples showing the 
performance characteristics of a secure web server based on SSL are presented. 
These examples are designed to show the impact of the number of users, the 
fraction of user requests that require secure communication, the type of 
cryptographic algorithms used, and the size of the html file retrieved, on response time 
performance. Finally, section 5 contains a summary of our findings and some 
concluding remarks. 



2 Secure Sockets Layer Protocol Session 

The Secure Sockets Layer (SSL) protocol is designed to provide secure 
communication. It performs server authentication, and optionally client authentication. 
With SSL, private information is protected through encryption, and a user is 
assured through server authentication that he/she is communicating with the 
desired web site and not with some bogus web site. In addition, SSL provides 
data integrity, i.e., protection against any attempt to modify the data transferred 
during a communication session. 

Our investigation of SSL is based on a publicly available implementation 
called SSLeay [12]. This allows us to collect measurement data that can be 
used to characterize the input parameters. Our measurement experiments are 
based on SSLeay version 0.6.6b, which is an implementation of SSL v2. This 
version is selected because it is a stable implementation and is therefore a good 
candidate to illustrate our modeling approach. Our model can easily be extended 
to study the behavior of later versions of SSL (v3 and transport layer security 
[13]). 

For convenience, we consider the case where an SSL session is set up to 
request a web page (or https). To facilitate the development of our model, we 
show in Figure 1 the events and activities as seen by the web server. The SSL 
session starts at the end of a user think time, which is indicated by the reception 
of a TCP connection request on port 443, the default port number assigned to 
https. The web server then proceeds with setting up the TCP connection, and 
this activity ends when the TCP connection is established. The web server then 
waits for a “client hello” message. When such a message is received, the server 
parses the message, and prepares a “server hello” message for transmission back 
to the client. 



Event (in time order)                     Activity that follows 

Previous SSL session ends                 User think time 
TCP connection request received           Connection set up time at server 
TCP connection established                Server waits for client hello message 
Client hello message received             Server prepares and transmits server hello message 
Server hello message transmitted          Server waits for client master key message from client 
Client master key message received        Server decrypts client master key, and prepares and 
                                          transmits server verify message 
Server verify message transmitted         Server waits for client finish message from client 
Client finish message received            Server prepares and transmits server finish message 
Server finish message transmitted         Server waits for http request from client 
http request received                     Server prepares and transmits http response 
http response transmitted 
(SSL session ends) 

Fig. 1. SSL session 



After the “server hello” message has been transmitted, the web server waits 
for a “client master key” message from the client. Upon receiving this message, 
the web server decrypts the client master key, and prepares a “server verify” 
message and transmits it to the client. At the end of transmission, the server waits 
for a “client finish” message. When this message is received, the server prepares 
a “server finish” message for transmission to the client. The SSL handshake ends 
when this message is transmitted. 

The web server next waits for an http request from the client. Upon receiving 
this request, the web server prepares an http response. When this response is 
transmitted, the SSL session ends and the user starts the next think time. 

Similarly, we show in Figure 2 the events and activities for the case where 
secure communication is not required, i.e., a normal http request. 



Event (in time order)                     Activity that follows 

Previous session ends                     User think time 
TCP connection request received           Connection set up time at server 
TCP connection established                Server waits for http request from client 
http request received                     Server prepares and transmits http response 
http response transmitted 
(session ends) 

Fig. 2. Normal http request 



3 Performance Model and System Measurement 

3.1 Performance Model 

Our performance model is a closed queueing network representing N users 
requesting web pages from a web server (see Figure 3). There are two types of 
requests: type 1 requires secure communication and type 2 is normal http. A 
request is generated at the end of a user think time. This request is of type 1 
with probability p and of type 2 with probability 1 - p. 

Fig. 3. Performance model of secure communication sessions 

There are two service centres: the web server and a “delay” server. The delay 
server represents all web server activities while waiting for the client, and any 
network delays. Each type 1 request cycles through the web server and delay 
server a number of times, following the activities shown in Figure 1. As a result, 
a type 1 request visits the web server several times, as shown by the stages of 
service in Figure 4. The type 1 request also visits the delay server a number 
of times, shown by the stages in Figure 5. After visiting the web server for the 
last time (which corresponds to the activity “server prepares and transmits http 
response”), a type 1 request is complete, i.e., the http response is returned to 
the user, and the user starts the next think time. Similarly, for type 2 requests, 
the stages of service at the web server and delay server are shown in Figures 6 
and 7 respectively. 

In general, processing of a user request requires usage of resources such as 
processor, memory and database servers. For convenience, we assume that the 
required resources are approximated by a number of service time parameters, 
one for each of the web server or delay server stages. Furthermore, the web server 
is modeled by a single server queue with processor-sharing discipline [14]. This 
discipline provides fair sharing of resources among all outstanding user requests. 
The user terminal is modeled as an “infinite server”, or no queueing. This is a 
well-accepted model for interactive systems. The delay server is also modeled as 
an “infinite server”. This assumption can be justified as follows. The time spent 
by a request at the delay server has two components: processing at the client 
and network delay. Each client is assumed to be running on a dedicated machine, 
and hence, there is no queueing at the client. As to the network delay, we assume 
that the time spent at the server machine to transmit or receive packets is small 
compared to that required for SSL protocol processing, similarly for time spent 
in the transport network. The no queueing assumption can therefore be used for 
the network delay component also. 

For our queueing network model, the input parameters are the number of 
users N, the mean think time h, the fraction of user requests that require secure 
communication p, and the mean service time of each stage at the web server and 
delay server for each of the two types of requests (as shown in Figures 4 to 7). 

Stage of service at web server                          Mean service time (measured) 

Connection set up time at server                        0.406 msec 
Server prepares and transmits server hello message      0.480 msec 
Server decrypts client master key, and prepares 
and transmits server verify message                     11.815 msec 
Server prepares and transmits server finish message     0.278 msec 
Server prepares and transmits http response             1.937 msec 

Fig. 4. Stages of service at web server for type 1 requests 



Stage of service at delay server                       Mean service time (measured)

Server waits for client hello message                   2.536 msec
Server waits for client master key message             12.407 msec
Server waits for client finish message                  0.472 msec
Server waits for HTTP request                           3.921 msec

Fig. 5. Stages of service at delay server for type 1 requests



Stage of service at web server                         Mean service time (measured)

Connection set up time at server                        0.322 msec
Server prepares and transmits HTTP response             1.911 msec

Fig. 6. Stages of service at web server for type 2 requests



3.2 Analytic Results 

Our model belongs to the class of queueing network models analyzed in [14].
Specifically, each web server or delay server stage is represented by a separate
customer class, and the class change feature allows us to model the customer
routing between the web server and delay server, as characterized by the
ordering of activities shown in Figure 1. The results from [14] are therefore
directly applicable.

Stage of service at delay server                       Mean service time (measured)

Server waits for HTTP request                           0.094 msec

Fig. 7. Stage of service at delay server for type 2 requests



For our investigation, the state description of interest is
$S = (n_0, (n_{11}, n_{12}), (n_{21}, n_{22}))$, where $n_0$ is the number of
users in the thinking state, and $n_{1r}$ and $n_{2r}$ are the number of type $r$
requests at the web server and the delay server respectively, $r = 1, 2$. Using
the results in [14], we obtain $P(S)$, the steady state probability that the
system is in state $S$. From $P(S)$, one can readily obtain the mean values for
$n_0$, $n_{11}$, $n_{12}$, $n_{21}$ and $n_{22}$ (denoted by $\bar{n}_0$,
$\bar{n}_{11}$, $\bar{n}_{12}$, $\bar{n}_{21}$ and $\bar{n}_{22}$ respectively).

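Let $T_r$ be the mean response time of type $r$ requests. Using Little’s formula
[15], we obtain:

$$T_1 = \frac{(\bar{n}_{11} + \bar{n}_{21})\,h}{\bar{n}_0\,p} \tag{1}$$

and

$$T_2 = \frac{(\bar{n}_{12} + \bar{n}_{22})\,h}{\bar{n}_0\,(1 - p)} \tag{2}$$

Finally, the mean response time over all requests is

$$T = \frac{(N - \bar{n}_0)\,h}{\bar{n}_0} \tag{3}$$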

It should be noted that response time is measured from the beginning to end 
of an SSL session (or a normal http session). It includes the delays while waiting 
for the client process, and any network delays. 
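
As a consistency check, equations (1) to (3) can be related through Little’s
formula. The sketch below assumes, as the model description implies, that each
thinking user issues requests at rate $1/h$ and that a fraction $p$ of these are
type 1. The throughput symbols $\lambda$, $\lambda_1$ and $\lambda_2$ are
introduced here for illustration only and do not appear in the original.

```latex
\begin{align*}
\lambda &= \frac{\bar{n}_0}{h}, \qquad
\lambda_1 = p\,\lambda, \qquad
\lambda_2 = (1-p)\,\lambda, \\[4pt]
T_1 &= \frac{\bar{n}_{11} + \bar{n}_{21}}{\lambda_1}, \qquad
T_2 = \frac{\bar{n}_{12} + \bar{n}_{22}}{\lambda_2}, \\[4pt]
p\,T_1 + (1-p)\,T_2
  &= \frac{\bar{n}_{11} + \bar{n}_{21} + \bar{n}_{12} + \bar{n}_{22}}{\lambda}
   = \frac{(N - \bar{n}_0)\,h}{\bar{n}_0} = T ,
\end{align*}
```

where the last step uses the fact that the $N$ users are either thinking or
have a request outstanding, so that
$N = \bar{n}_0 + \bar{n}_{11} + \bar{n}_{12} + \bar{n}_{21} + \bar{n}_{22}$.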

It should also be noted that the formulas for the mean response time ($T_1$,
$T_2$ and $T$) are valid for a wide range of distributional assumptions for the think
time, and for the service times at the various web server and delay server stages.
The reason is that the steady state probability $P(S)$ depends only on the mean
think time and the mean service times at these stages [14].
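
Because only mean values matter, the mean response times can be approximated
numerically with very little code. The following is a minimal sketch, not the
method of [14] or [18]: it swaps in textbook mean value analysis (MVA) for a
closed product-form network, treating the web server as a processor-sharing
station and the think and delay stages as infinite servers. The per-request
service demands are the sums of the stage times in Figures 4 to 7; the function
name and its parameters are illustrative assumptions.

```python
# Per-request service demands in msec, summed from the stage times in
# Figures 4 to 7 (type 1 = SSL session, type 2 = normal http).
WEB_MS = {1: 0.406 + 0.480 + 11.815 + 0.278 + 1.937, 2: 0.322 + 1.911}
DELAY_MS = {1: 2.536 + 12.407 + 0.472 + 3.921, 2: 0.094}


def mean_response_times(n_users, p, think_ms=2000.0):
    """Return (T1, T2, T, n0_bar), times in msec, via single-chain MVA.

    Assumption (not from the paper): each request is independently type 1
    with probability p, so all users form one chain whose average demand
    per request is the p-weighted mix of the two request types.
    """
    if n_users < 1:
        raise ValueError("n_users must be at least 1")
    d_web = p * WEB_MS[1] + (1 - p) * WEB_MS[2]
    d_delay = p * DELAY_MS[1] + (1 - p) * DELAY_MS[2]

    queue = 0.0        # mean number of requests at the web server
    stretch = 1.0      # slowdown factor caused by processor sharing
    throughput = 0.0   # requests per msec
    for n in range(1, n_users + 1):
        # Arrival theorem: an arriving request sees the web-server queue
        # of the same network with one user fewer.
        stretch = 1.0 + queue
        r_web = d_web * stretch
        throughput = n / (think_ms + d_delay + r_web)
        queue = throughput * r_web

    t1 = WEB_MS[1] * stretch + DELAY_MS[1]
    t2 = WEB_MS[2] * stretch + DELAY_MS[2]
    t = p * t1 + (1 - p) * t2
    n0_bar = throughput * think_ms   # mean number of thinking users
    return t1, t2, t, n0_bar
```

For example, `mean_response_times(200, 0.2)` evaluates the model for 200 users
with a 2 second think time; the returned `n0_bar` can be substituted into
equation (3) to confirm that it reproduces the returned `T`.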



3.3 System Measurement 

In this subsection, we discuss our methodology to obtain the mean service time 
for each of the web server and delay server stages. We built a web server based 
on SSLeay [12]. The server is written in C and compiled with gcc 2.6 with 
optimization. Our experimental system consists of a Sun Ultra 10 running SunOS 
5.6, which works as the web server. The client machine is a Sun SPARCstation
10 running SunOS 5.5. These two machines are connected to a 100 Mbit/s
Ethernet. The required service time parameters can be determined if we know 
the event times of all the events in Figures 1 and 2. This can be accomplished 
if we are able to measure the time at which a message is received at the server, 
and the time at which transmission of a message at the server is finished. 

To measure the time at which a message is received at the server, we use 
the I/O multiplexing model of Unix network I/O [16]. This model allows us to 
determine exactly when data for a given message are received. We thus modify 
the available SSLeay source code and place a “select” system call before every 
server subroutine that attempts to receive data from the client. Each time a message
is received, the event time is obtained by using the “gettimeofday” system call 
which returns the current time at a resolution of microseconds. 

Measuring the time at which transmission of a message at the server is finished
is quite straightforward. One simply uses the “gettimeofday” system call
to get the event time when transmission is complete.
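
The two timestamping steps above can be sketched as follows. This is an
illustration in Python rather than the modified SSLeay C code, which is not
reproduced in the paper: `select.select` stands in for the “select” system
call and `time.time` for “gettimeofday”. The helper names `timed_receive` and
`timed_send` are hypothetical.

```python
import select
import socket
import time


def timed_receive(sock, nbytes, timeout_s=5.0):
    """Wait in select() until data are readable, timestamp, then read.

    Returns (arrival_time_in_seconds, data). Taking the timestamp right
    after select() returns records when the message arrived, not when the
    application happened to get around to reading it.
    """
    readable, _, _ = select.select([sock], [], [], timeout_s)
    if not readable:
        raise TimeoutError("no data arrived before the timeout")
    arrival = time.time()   # wall-clock time, analogous to gettimeofday
    return arrival, sock.recv(nbytes)


def timed_send(sock, payload):
    """Send the whole payload and return the time at which sendall() returned."""
    sock.sendall(payload)
    return time.time()


if __name__ == "__main__":
    client, server = socket.socketpair()
    sent_at = timed_send(client, b"client hello")
    received_at, message = timed_receive(server, 1024)
    print(message, received_at - sent_at)
    client.close()
    server.close()
```

The difference between consecutive receive and send timestamps at the server
gives the time spent in a web server stage, and the gap between a send and the
next receive gives the corresponding delay server stage. The clock resolution
actually achieved depends on the platform.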



3.4 Measurement Results 

We conducted two measurement experiments to obtain values for the service time 
parameters. In these experiments, we used a scenario of only one user interacting 
with a web server. The html file size is 1.0 Kbyte. Public-key cryptography is 
based on 512-bit RSA [17], and secret-key encryption is based on RC4 [17]. 

Data were collected under the condition of no other applications running 
on the server and client machines. The first experiment involves SSL sessions 
(or type 1 requests) only, while the second experiment is concerned only with 
normal http (or type 2 requests). The measured values of mean service time at 
the various web server and delay server stages are included in Figures 4 to 7. 



4 Numerical Examples 

For closed queueing networks, the amount of computation required to obtain 
numerical results for performance measures such as mean response time may 
be substantial. Efficient computational algorithms are widely available. For our 
model, numerical results are obtained by using the algorithm reported in [18]. 

We first investigate the effect of the number of users N and the parameter
p on the mean response time over all requests. The results for p = 0.1, 0.2 and
0.3 are shown in Figure 8. The mean think time h is assumed to be 2 seconds.^1
It is observed that with a larger p, there is a performance penalty (i.e., a larger
mean response time) even when N is small. The incremental mean response
time increases with N and p. The performance penalty incurred by SSL can be
explained as follows.

^1 This value of h is selected such that the amount of computation is not excessive
for reasonably large values of N. For a larger h, the performance characteristics are
similar, except that the system is able to support more users.

Fig. 8. Mean response time over all requests



Under SSL, significant processing time is required at the web server to 
decrypt the client master key because public-key cryptography is used. As shown 
in Figure 4, the mean service time for this activity is 11.815 msec compared to 
1.937 msec for preparing an http response. Also, in SSL, secret-key cryptography 
is used when the web server prepares (i) the “server finish” message and (ii) the 
http response. This adds to the service times for the SSL session. Further
delays are incurred because cryptographic hash functions are computed for all
server activities except TCP connection establishment. In addition, an SSL
session has more client-server interactions than a normal http session. This is a
result of the handshake protocol required to set up the SSL session. These additional
interactions incur delay, not only because of the processing required, but
also because of the need to wait for the network round-trip time and any processing
required at the client.

To further illustrate the response time performance of SSL, we plot in Figures
9 and 10 the mean response time of type 1 and type 2 requests respectively. It is
observed that an increase in p results in performance degradation for both types
of requests.

Our second set of results is concerned with the impact of increasing the
security strength on SSL performance. Consider first public-key cryptography.
The results in Figure 8 are based on measured data for a 512-bit RSA key. The
longer the RSA key, the more secure the SSL protocol is. We measured the
processing time required on a Sun Ultra 10 for three different RSA private key
sizes, and the results are shown in Table 1. Numerical results for mean response
time are then obtained for these three key sizes and for p = 0.2. These results are
shown in Figure 11. We observe that the response time performance degrades
significantly as the RSA key size increases. The reason is that the RSA private
key operation incurs a significant overhead at the web server.
The response time can be improved by using an RSA public key cryptography
accelerator that is implemented in hardware, e.g., CryptoSwift II [19].

Fig. 9. Mean response time of type 1 requests




Fig. 10. Mean response time of type 2 requests 

Table 1. RSA private key cryptographic operation performance

RSA key size      Time per private key operation

 512 bits          11 msec
1024 bits          68 msec
2048 bits         481 msec
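
To put the figures in Table 1 in perspective, the short sketch below computes
how quickly the private key operation time grows with key size. It is an
illustrative calculation on the three measured values only; the function name
is an assumption, and the comparison with cubic growth reflects the commonly
quoted cost of modular exponentiation rather than anything stated in the paper.

```python
import math

# Measured time per RSA private key operation, in msec (Table 1).
RSA_PRIVATE_OP_MS = {512: 11.0, 1024: 68.0, 2048: 481.0}


def scaling_exponents(timings):
    """For each pair of adjacent key sizes, return
    (smaller_size, larger_size, time_ratio, exponent), where exponent k
    satisfies time_ratio = (larger_size / smaller_size) ** k.
    """
    sizes = sorted(timings)
    rows = []
    for small, large in zip(sizes, sizes[1:]):
        ratio = timings[large] / timings[small]
        exponent = math.log(ratio) / math.log(large / small)
        rows.append((small, large, ratio, exponent))
    return rows


if __name__ == "__main__":
    for small, large, ratio, exponent in scaling_exponents(RSA_PRIVATE_OP_MS):
        print(f"{small} -> {large} bits: x{ratio:.2f}, exponent {exponent:.2f}")
```

Each doubling of the key size multiplies the measured time by roughly six to
seven, which is somewhat below the factor of eight that purely cubic growth
would give.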



We next consider the performance impact of secret-key encryption. The
results in Figure 8 are based on measured data for RC4. When a higher level of
security is required, recommended encryption algorithms include IDEA [17] and
Triple DES [17]. We measured the encryption rate on a Sun Ultra 10 for these
three algorithms, and the results are shown in Table 2. Numerical results for
mean response time are then obtained for these three cases, and for p = 0.2.
These results are shown in Figure 12. We observe that the mean response time
increases as stronger secret-key encryption algorithms are used. The increase
is mainly due to the overhead incurred by the secret-key cryptographic operation
when applied to the html file. Compared to the results in Figure 11, we observe
that for the parameters under consideration, the performance penalty is small
compared to that incurred by public-key cryptography. However, the observation
may be different if the html file size is much larger, because the overhead
incurred by secret-key cryptography grows with the file size, resulting in a
significant increase in the mean response time. To illustrate this point, we show
in Figure 13 the mean response time performance when the html file sizes are 1 and
2 Mbytes instead of 1 Kbyte. Compared to the results in Figure 11, the performance
penalty resulting from secret-key encryption is comparable to that from public-key
cryptography.



Table 2. Performance of RC4, IDEA and Triple DES

Encryption algorithm      Encryption speed

RC4                       10568.51 Kbyte/sec
IDEA                       2794.01 Kbyte/sec
Triple DES                  809.23 Kbyte/sec
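
The effect of file size can be estimated directly from Table 2. The sketch
below is a back-of-the-envelope calculation, not part of the paper's model: it
assumes that encryption time is simply file size divided by the measured rate,
and that 1 Mbyte equals 1024 Kbyte. The function name is illustrative.

```python
# Measured encryption speed in Kbyte/sec (Table 2).
ENCRYPTION_KBYTE_PER_SEC = {
    "RC4": 10568.51,
    "IDEA": 2794.01,
    "Triple DES": 809.23,
}


def encryption_time_ms(file_kbytes, algorithm):
    """Estimated time in msec to encrypt a file of the given size,
    assuming the throughput in Table 2 is sustained for the whole file."""
    return 1000.0 * file_kbytes / ENCRYPTION_KBYTE_PER_SEC[algorithm]


if __name__ == "__main__":
    for size_kbytes in (1, 1024, 2048):   # 1 Kbyte, 1 Mbyte, 2 Mbytes
        for name in ENCRYPTION_KBYTE_PER_SEC:
            ms = encryption_time_ms(size_kbytes, name)
            print(f"{size_kbytes:5d} Kbyte  {name:10s}  {ms:10.3f} msec")
```

Under these assumptions, encrypting a 1 Kbyte file takes well under 2 msec with
any of the three algorithms, whereas encrypting 1 Mbyte with Triple DES takes
more than a second, which exceeds even the 2048-bit RSA time in Table 1.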



5 Conclusions 

We have presented an analytic model for studying the performance of SSL. Our
model is based on a detailed representation of the client-server interactions
resulting from an SSL session. Numerical results showing the tradeoff between
security and response time performance have been obtained. These results
provide useful insight into the performance characteristics of a secure web server
under a range of operating conditions.

Our analytic model is rather general because the results are valid for arbitrary 
think time and arbitrary service time distributions at the web server and delay 
server. In addition, our modeling approach can readily be extended to study 
other secure communication protocols, e.g., SET.



Abstract. IBM Research and the U.S. Department of Defense teamed
to determine if governmental high assurance practices could be applied
to commercially available network computers. The focus of the project
was on using the thin client computing architecture to connect to trusted
information domains of different classification levels at different times.
Importantly, the information from a given classified domain must not
migrate from its domain. Achieving this goal requires state clearing, and
encrypting and authenticating all transferred data between the thin
client and the trusted domain.



1 Introduction 

Sherlock was a research project of the U.S. Department of Defense (DoD),
investigated in partnership with IBM Research. Both parties undertook the project
to determine whether governmental assurance needs could be met starting from
commercial off-the-shelf (COTS) products. In brief, we found that, with certain
carefully chosen enhancements, commercial products can achieve surprisingly
high levels of assurance and can be used in some situations traditionally
reserved for government designed devices. In the process, we designed and prototyped
a system for secure networking.

While the majority of the function described here has been implemented 
in the Sherlock system prototype, a small portion has been implemented but 
not integrated, and a smaller portion has been designed, but not implemented. 
As this paper is an extended abstract discussing early results, it is impractical 
to point out the current state of each component. Therefore, we describe the 
prototype as if it has already been fully implemented and integrated. We believe 
that every component for which “invention” was required has been implemented 
successfully; thus, we believe that the results we describe here are valid. 

This paper is organized as follows: Section 2 describes the backdrop against
which the project was conceived. Section 3 outlines the Sherlock system
architecture and the design of its components. Section 4 discusses our results to date.
Section 5 presents the key conclusions of the Sherlock project. The Appendix
contains a description of the software incorporated in the prototype.



2 Background

Within the Intelligence Community, users need to access multiple computer
systems of varying classification levels, often on a daily basis. Where once there was
one network to which everyone had access, now there are multiple networks with
different classifications, accessed via different computers. While this architecture
is highly secure, it is also very expensive.

The first goal of Sherlock was to determine whether a single, inexpensive
computer could, for many users, practically replace several of their present
computers. The second goal was to expand the range of services offered to users, e.g.,
the ability to log in to one agency’s networks from another’s computers, and to
dial in to various networks from home, potentially via the Internet. Of course,
we needed to balance these goals with the need to maintain very high security.

2.1 Governmental High Assurance 

“High assurance” has as many interpretations as people who use the term. A
typical government definition might be, “the ability to prove beyond a reasonable
doubt that a security function shall work in the specified manner designed, and
that suitable protective techniques have been correctly implemented to prohibit
the malfunction and/or denial of that security function.” A weaker definition
might be that, should the security device fail, it must do so without
compromising security, i.e., it must be “fail-secure.”

Government equipment that achieves high assurance must implement many
specialized techniques in hardware and software. These techniques often imply
extensive design and development of customized hardware and software, which
tends to be very time consuming and very expensive. Unfortunately,
implementing these techniques often limits the flexibility of the equipment when mission
parameters change, and tends to increase greatly the complexity as seen by users
of the equipment.

In areas that require strong security, these costs must be met and any
compromise is unacceptable. In applications with less stringent security requirements,
however, it may be possible to leverage the best commercial practices combined
with some specialized high assurance techniques, not only to provide a
cost-effective and flexible solution, but also to provide a level of assurance.

2.2 Commercial High Assurance 

When designing the Sherlock system, we defined the concept “commercial high 
assurance,” as the highest assurance that can be achieved while maintaining 
commercial viability. We did so in recognition that commercial realities make it 
essentially impossible to achieve governmental high assurance for a product that 
is to be sold in both the government and commercial markets. 

Although the government market is large, its size pales in comparison with
that of the commercial market. The cost and complexity of providing specialized
hardware and software for the government market versus that of providing
off-the-shelf solutions for the commercial market makes it difficult for a company
to justify dedicating substantial resources to the government market.

Even more important, the economies of scale that play a large role in the
commercial market play a much smaller role in the government market. In the
commercial market, commodity parts allow manufacturers to sell products at
a slim profit margin, but to make a substantial profit because of volume. A
commercial manufacturer of governmental high assurance products would require
a substantial profit margin to justify the high engineering costs, the lack of
commodity parts and the relatively low volumes of the government market.

Commercially available products, even those built to best commercial
practices, typically exhibit relatively low assurance. Thus, the third goal of Sherlock
was to determine the extent to which it was practical to improve upon best
commercial practices to achieve higher assurance while maintaining commercial
viability. We believe that Sherlock demonstrates that surprisingly high levels of
assurance can be achieved with relatively small enhancements to COTS products
and, therefore, commercial high assurance can be high enough for some, if not
many, government uses.



3 Sherlock Architecture 

Sherlock implements a network or thin client computing system that can connect
to trusted information domains at various levels of classification, one after
another. That is to say, Sherlock secure thin clients (STCs) were designed to connect
to an information domain containing data at a certain level of classification,
work in that domain for a certain period of time, then clear the STC’s state
while exiting the domain. Once its state has been cleared, the STC can enter
another domain at the same or a different level of classification, thus exhibiting
single level at a time (SLAT) security properties.

Above all other concerns, Sherlock must maintain information within its 
trusted domain. Sherlock achieves this with two important techniques: STC state 
clearing and encrypting and authenticating all data transferred between an STC 
and a trusted domain. 
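
The single level at a time behaviour can be pictured as a small state machine.
The sketch below is purely illustrative and is not taken from the Sherlock
prototype: the class, state and method names are assumptions, and exiting and
clearing are modelled as two separate steps only to make the ordering rule
explicit.

```python
from enum import Enum, auto


class StcState(Enum):
    CLEARED = auto()     # no residue from any domain; may enter a domain
    CONNECTED = auto()   # working inside exactly one trusted domain
    DIRTY = auto()       # has left a domain but has not been cleared yet


class SecureThinClient:
    """Toy model of an STC that may hold data from one domain at a time."""

    def __init__(self):
        self.state = StcState.CLEARED
        self.domain = None

    def enter(self, domain):
        # Entering is only allowed from a cleared state, so data from a
        # previous domain can never be carried into the next one.
        if self.state is not StcState.CLEARED:
            raise RuntimeError("state must be cleared before entering a domain")
        self.domain = domain
        self.state = StcState.CONNECTED

    def exit_domain(self):
        if self.state is not StcState.CONNECTED:
            raise RuntimeError("not connected to a domain")
        self.domain = None
        self.state = StcState.DIRTY

    def clear_state(self):
        # Stands in for the memory, I/O space and PCI configuration clearing.
        self.state = StcState.CLEARED
```

In this model, a call sequence such as `enter("A")`, `exit_domain()`,
`clear_state()`, `enter("B")` succeeds, whereas skipping `clear_state()` makes
the second `enter` fail.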



3.1 System Overview 

Figure 1 depicts a typical Sherlock system. The system comprises three STCs 
connected to a local domain to which two trusted domains also are connected. 
Each trusted domain comprises, in turn, three trusted servers connected to the 
local domain through a security gateway (SG). For additional security, the local
domain, while not trusted, is separated by a firewall from whatever lies “beyond,”
perhaps an intranet or even the Internet.

An STC is a commercial, off-the-shelf thin client augmented with a Sherlock 
security module (SM) and a new BIOS. An SG is a virtual private network 
(VPN) router that has been modified to support Internet Key Exchange (IKE) 
with user-host authentication (described below). A trusted server is a commercial
computer running an unmodified operating system and unmodified applications. 

A trusted domain is physically secured, independently administered and has
its own private digital certificate authority (CA). The CA issues X.509
certificates to authorized users and stores those certificates on users’ smart cards, along




Sherlock: Commercial High Assurance Network Computing 







with the users’ corresponding private keys and the root certificate for the CA. 
For ease of use, a given user’s smart card may contain certificates and private 
keys issued by the CAs of several trusted domains. 

A trusted domain connects to the local domain via an SG. As the TCP/IP
protocol suite is used for communication in Sherlock systems, SGs support the
Internet Protocol Security Architecture (IPSec). Additionally, SGs provide
user-host authentication via a modified version of the Internet Key Exchange (IKE)
protocol.

The local domain is an untrusted, local area network that logically snakes 
through a campus, building or portion of a building. Although we do not go into 
detail about the function of a local domain, an STC must have an IP address 
statically assigned to it or the local domain must supply one, e.g., via DHCP 
or BOOTP. However, an STC does not download other information such as an 
operating system from the local domain, nor can other equipment on the local 
domain interact with the STC once it has begun operation. 

The local domain also has a private CA that issues X.509 certificates to
authorized STCs. An STC’s certificate and associated private key are stored on a
smart token resident on the STC’s SM, as is the public key associated with the
CA’s root certificate. An STC is connected to and certified by exactly one local
domain.

3.2 Secure Thin Client Architecture

Figure 2 depicts the architecture of a secure thin client. The prototype STC
consists of an IBM NetVista Model N2800 thin client, a Gemplus GCR500 smart
card reader and the custom PCI-bus security module.




Fig. 2. Secure Thin Client Architecture 



NetVista Thin Client. The NetVista is a lightweight computer with the
conventional personal computer architecture, powered by an Intel Pentium MMX
processor. The Gemplus smart card reader is the interface between the SM and
the Schlumberger Cyberflex Access Java-enabled smart cards that identify
individual users. The SM is a custom board built of standard parts. The SM
incorporates, among other things, a Dallas Semiconductor Cryptographic iButton,
which is a smart token coupled with a secure, real-time clock. With the exception
of the SM and the Sherlock STC BIOS, all components of the STC are
commercially available, off-the-shelf. (The Sherlock STC BIOS is simply FLASHed over
the production NetVista BIOS.)

Security Module. The security module is a custom hardware adapter that
plugs into the NetVista’s PCI-bus expansion slot. It appears to the NetVista to
be an Ethernet adapter. In fact, the SM forms the cornerstone of the STC in
that it performs all essential Sherlock functions.

In particular, the SM verifies on startup that the NetVista has a valid
Sherlock STC BIOS in its FLASH and that the NetVista has cleared its state.
The SM manages the iButton, the smart card and the smart card reader. The
SM’s other functions include retrieving and validating the user’s PIN, retrieving
user and host certificates, generating random numbers, maintaining and
accessing the iButton’s real time clock, and requesting that the smart tokens sign
data with their private keys.

The SM also manages and processes all network traffic for the NetVista. It 
negotiates session keys via IKE. It acts as a VPN router, in a manner similar 
to that of a security gateway. Additionally, the SM can reset the NetVista if it 
detects errors or attempted security violations. 

Security Module Hardware. Figure 3 depicts the architecture of the Sherlock 
security module. 




Fig. 3. Security Module Architecture 



The SM comprises the following subsystems: central processing, system
interface, networking, smart token support and cryptographic accelerator support.

The central processing subsystem consists of a CPU, FLASH memory and 
dynamic RAM. In the prototype, we used an Intel i960VH as the CPU; however, 
we could have used any moderate performance, compact CPU, ideally with an 
embedded PCI interface. Reasonable alternatives include several members of the 
PowerPC and StrongARM families.

The system interface subsystem, an embedded PCI-bus bridge, connects the
CPU’s local bus and the NetVista’s PCI bus. We chose the Cypress AN3042 for
this job for four reasons. First, the AN3042 is a full-fledged, embedded bridge
capable of issuing all PCI bus cycles. Second, it shares 16 kilobytes of
dual-ported RAM between the local (CPU) and PCI buses. Third, it does not allow
accesses from the PCI bus to the local bus. Fourth, it supports local buses for a
variety of CPUs.

The networking subsystem consists of an Intel 82559 Ethernet chip and a
few analog components. We chose this chip because it integrates a 10/100 Mbps
Ethernet media access control and a physical layer in a very compact package.
We located the Ethernet subsystem on the CPU’s private PCI bus to isolate it
from the other subsystems. The CPU’s PCI-bus interface provides a hardware
guarantee that the Ethernet subsystem can access only a specific four kilobyte
region of the SM’s DRAM.

The smart token subsystem includes a DUART for simultaneous, serial access
to an external, off-the-shelf smart card reader, the Gemplus GCR500, and to the
iButton mounted on the SM. The connection to the iButton is through a Dallas
Semiconductor 2480 serial-to-“1-wire” interface chip. The smart card reader has
an integral display and keypad and connects directly to the SM via an RS-232
serial cable, thus providing a trusted path to the SM.

Finally, the cryptographic accelerator support subsystem is an optional
component that consists of a riser card extending the SM’s local bus to a second,
private PCI bus. The riser card contains an AN3042 embedded bridge chip for
this purpose. We designed the SM to support virtually any PCI-bus
cryptographic accelerator, of which many are available, supporting governmental or
commercial algorithms.

Fundamental to the SM’s design is that the CPU’s PCI-bus interface and
the two embedded PCI-bus bridges each act as hardware firewalls to the central
processing subsystem, thus guaranteeing that the SM’s software and data can be
neither read nor changed using conventional means.

We chose not to implement hardware tamper resistance on the SM for three
reasons. First, the smart card and iButton provide a certain degree of tamper
resistance, in particular protecting their private signing keys. Second, we
anticipate the STCs typically will be deployed within physically secure facilities
or, if not, will be restricted to accessing trusted domains having relatively low
classification levels. Third, the IBM 4758 PCI Cryptographic adapter, the only
commercially available cryptographic module to have achieved FIPS 140-1
Level 4 certification, demonstrates that IBM is capable of building modules with
highly sophisticated hardware and software tamper resistance.

Security Module Software. All of the SM’s software was either derived from
open source materials or written by the Sherlock team. We used open source
materials because we found that, under this venue, a great deal of high quality,
unencumbered software with full source code was freely available. Moreover, we
found that support was widely available and highly responsive, both
commercially and from the open source community.

We used the Real Time Executive for Multiprocessor Systems (RTEMS),
developed by Online Applications Research Corp. for the U.S. Army’s Redstone
Missile Command. RTEMS is a mature, fully featured, embedded O/S with
roots in military systems and, as a result, is highly reliable and was carefully
written. We used a board support package graciously provided by Gerwin Pfab,
in support of the i960VH. We ported the IPSec and IKE packages from the
OpenBSD project to RTEMS, making changes where necessary and modifying
them to use an IBM-written cryptographic library. We also ported smart card
software written by Gemplus.

In addition to the ported and modified code, we wrote Sherlock-specific code 
for the following: a smart card application subsystem, an iButton application 
subsystem, device drivers for the smart card reader and iButton controller, a 
device driver for the Ethernet chip and the two embedded bridges, a virtual 
router, software to verify that the NetVista has cleared its state, an overall 
application shell, software to interact with the NetVista’s BIOS and software to 
download an operating system loader to the NetVista. 



3.3 Bootstrapping the Secure Thin Client 

Before an STC can connect to a trusted information domain, its SM must ensure
that the STC is starting in a clean state and that the user is authenticated and
is authorized to make the connection.



Thin Client State Clearing. When a user asks to switch from one trusted
domain to another, the SM first ensures that any “residue” from the first domain
is cleared from the STC, before allowing the STC to enter the second domain.
This ensures that the NetVista cannot transfer information between domains,
thus preventing cross-domain data “leakage.”

State clearing is implemented by returning the NetVista to a known
configuration, i.e., the one that it had when it booted for the first time with its SM
installed. To achieve this, the SM clears the STC’s memory, I/O space and
PCI-bus configuration space. The SM clears all of memory except for the lowest two
kilobytes, which contain the Sherlock STC’s BIOS data and stack. This region
is wiped clear by having the SM write known values into it. The values are those
that were recorded after the SM was first installed into the NetVista and the
NetVista reached a certain point in its boot process. Re-writing these values into
the NetVista’s memory forces the NetVista back into its initial state.

Having the SM clear memory by writing into it, instead of having the STC
write into memory and having the SM verify the values in memory, guarantees
that the NetVista’s Level 1 and Level 2 caches have been flushed and invalidated,
i.e., they retain no data from the first domain.

I/O space is a historical remnant of the original IBM PC. Unfortunately,
clearing it is complex, as various operations applied to various locations can
have (by design) complex side effects. Accessing a given location with differently
sized operands can have drastically different side effects. For example, issuing a
one-byte I/O read to a given address resets the NetVista, whereas issuing a
two-byte I/O read to the same address returns the value in a certain configuration
register.

Clearing I/O space arguably is the most complex task in clearing the
NetVista’s state. In general, our technique has been to read carefully the
documentation for the NetVista’s various components, to determine which areas of I/O
space are used for what purposes and to determine, if possible, whether a default
value can be written into those areas. This method covers the vast majority of
cases. As I/O space is limited to 64 kilobytes, “safe” values to write into the
undocumented or inadequately documented locations may be determined by trial
and error. Thus, clearing the I/O space becomes nothing more than writing,
from a table stored in the SM, known values into known locations. Of course,
the negative side to this method is that, whenever the underlying hardware is
changed, e.g., when a new model thin client is to be supported, this table must
be computed again, largely by hand.

Clearing PCI configuration space is relatively easy as it comprises at most 
256 bytes per PCI device and as there are no legacy issues of concern. PCI 
configuration space may be cleared by writing zeroes into it. 
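
The table-driven approach to clearing I/O space and PCI configuration space can
be sketched as follows. This is a simulation over plain byte arrays, not code
from the SM: the port addresses and values in `IO_RESET_TABLE` are made-up
placeholders, the function names are assumptions, and the operand-size side
effects of real I/O accesses described above are deliberately ignored.

```python
IO_SPACE_BYTES = 64 * 1024    # I/O space is limited to 64 kilobytes
PCI_CONFIG_BYTES = 256        # at most 256 bytes per PCI device

# Hypothetical (port, value) pairs. In a real system this table would be
# derived from component documentation and trial and error for one
# specific thin client model.
IO_RESET_TABLE = [
    (0x0060, 0x00),
    (0x0080, 0x1F),
    (0x03F8, 0x00),
]


def clear_io_space(io_space, reset_table):
    """Write each known value into its known location."""
    for port, value in reset_table:
        io_space[port] = value


def clear_pci_config(config_space):
    """Overwrite one device's PCI configuration space with zeroes, in place."""
    config_space[:] = bytes(len(config_space))


if __name__ == "__main__":
    io_space = bytearray(b"\xff" * IO_SPACE_BYTES)      # simulated residue
    pci_config = bytearray(b"\xaa" * PCI_CONFIG_BYTES)  # simulated residue
    clear_io_space(io_space, IO_RESET_TABLE)
    clear_pci_config(pci_config)
    print(hex(io_space[0x0080]), any(pci_config))
```

Keeping the values in a table separates the clearing procedure from the
model-specific data, which is also why supporting a new thin client model
means rebuilding the table rather than rewriting the procedure.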

As an additional security check while booting, the SM verifies the digital 
signature of the BIOS image installed on the NetVista. (The boot image shonld 
have been signed by the local domain’s CA.) 

Once the SM has determined that the NetVista’s state has been cleared and 
that its firmware has not been tampered with, the SM downloads an operating 
system loader to the NetVista. After authentication, detailed in the next 
subsections, the loader loads the NetVista’s O/S via the “Ethernet” (actually, 
the SM, acting as an Ethernet adapter) and the NetVista starts the O/S. 



Smart Token Authentication. Next, the SM prompts the user, via the smart 
card reader display, to insert their card into the reader. When the card is in the 
reader, the SM prompts the user to enter their PIN. If the smart card validates 
the PIN, the SM prompts the user to choose the information domain to which 
they desire to login. 

All data exchanged between the smart card reader and the SM over the serial 
cable are encrypted. The SM supports this by facilitating the iButton and smart 
card to exchange public key information and thereby to create keying material 
for subsequent encryption. 

Once login is complete, the smart card reader displays the name and classi- 
fication level of the information domain. The NetVista’s keyboard, mouse and 
monitor are not used in the login process. 



3.4 Securing Communication 

At this point, the NetVista is ready to download the O/S and to access 
applications and data from the trusted domain. The link between the local domain 
and the trusted domain may pass through untrusted data lines. To protect in- 
formation that may be sent over such lines, several security mechanisms are 
used. 







Virtual Private Networking. Rather than set up costly and complex private 
networks from STCs to trusted domains, Sherlock implements an IPSec-based 
virtual private network. 

All traffic from STCs to trusted domains and vice versa is digitally hashed 
and encrypted. IPSec’s AH hashing is used to authenticate the sender to the 
receiver and IPSec’s ESP encryption keeps the data and information about the 
connection private. These functions are performed by the SM, on behalf of the 
NetVista, at one end of a connection and by the security gateway, on behalf of 
the trusted domain, at the other. Packets are decrypted and verified only when 
they have reached the end of the connection. Commercial algorithms, e.g., Triple 
DES and SHA-1, were used in the prototype, but other algorithms, especially 
governmental ones, readily could be substituted. In addition to supporting basic 
VPN function, both ends of a connection perform packet filtering. 

Keying material for IPSec is generated via the Internet Key Exchange (IKE) 
protocol, which implements the Internet Security Association and Key Management 
Protocol (ISAKMP), corresponding to IKE Phase I, and Oakley, corresponding 
to IKE Phase II. 

Sherlock uses X.509 digital certificates, described in detail in the next subs- 
ections, in IKE exchanges. Unlike standard IKE, for reasons explained below, 
Sherlock’s IKE implementation requires two certificates: one for the user and one 
for the STC. Moreover, Sherlock’s IKE implements what we call “double-perfect 
forward secrecy,” i.e., it uses no IKE Phase I keying material in performing an 
IKE Phase II exchange. 



X.509 Digital Certificates. The smart card and iButton together provide 
the SM’s cryptographic functions. These smart tokens are capable of performing 
various symmetric and public key cryptographic algorithms including Triple DES 
and RSA, respectively, and support the addition of custom JavaCard applets. 

Each user has a smart card that provides a “what you have and what you 
know” capability. That is, the user must present both the smart card and a PIN 
to enable it before being accepted by the SM. For each trusted domain that 
the user may access, the smart card holds the user’s X.509 certificate and the 
private key associated with the certificate. Also, the smart card holds the root 
certificate for the trusted domain. 

Each SM has an iButton that holds the SM’s host certificate, the private key 
associated with the certificate and the root certificate for the local domain with 
which the SM (and hence the STC) is associated. The iButton is loaded with 
these data when the STC is assigned to the local domain. 



Authentication with X.509 Certificates. Certificate validation involves the 
following tests. 

— Has the certificate expired? 

— Has the certificate been revoked? 

— Did the CA that it claims sign the certificate? 

— Does the certificate presenter also hold the private key? 

When setting up an IPSec connection, IKE running at a security gateway 
validates the user’s certificate by performing the above tests. Since the CA that 
would have issued the certificate is located within the trusted domain, the SG is 
able to retrieve the root certificate and the certificate revocation list (CRL) for 
the CA directly from a server within the domain. Checking the expiry date and 
the CRL for the certificate’s serial number is easily performed, as is verifying 
that the trusted domain’s CA signed the certificate. The IKE protocol verifies 
that the certificate presenter is also in possession of the private key. 
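
The four validation tests listed above can be sketched as a single check. This is a simplified model under stated assumptions, not the security gateway's implementation: `Cert` is a hypothetical stand-in for a parsed X.509 certificate, and `signed_by_ca` and `holds_private_key` are hypothetical callbacks standing in for the CA signature check and for the proof of possession that the IKE exchange provides.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, List, Set


@dataclass(frozen=True)
class Cert:
    """Hypothetical, heavily simplified view of an X.509 certificate."""
    serial: int
    issuer: str
    not_after: datetime


def validate(
    cert: Cert,
    now: datetime,
    crl_serials: Set[int],
    trusted_issuer: str,
    signed_by_ca: Callable[[Cert], bool],
    holds_private_key: Callable[[Cert], bool],
) -> List[str]:
    """Return the list of failed tests; an empty list means the cert is valid."""
    failures = []
    if now > cert.not_after:                      # Has the certificate expired?
        failures.append("expired")
    if cert.serial in crl_serials:                # Has it been revoked?
        failures.append("revoked")
    if cert.issuer != trusted_issuer or not signed_by_ca(cert):
        failures.append("not signed by claimed CA")
    if not holds_private_key(cert):               # Does the presenter hold the key?
        failures.append("no proof of possession")
    return failures


if __name__ == "__main__":
    cert = Cert(serial=7, issuer="blackcat-ca",
                not_after=datetime(2001, 1, 1, tzinfo=timezone.utc))
    now = datetime(2000, 12, 1, tzinfo=timezone.utc)
    print(validate(cert, now, set(), "blackcat-ca",
                   lambda c: True, lambda c: True))
```

Returning every failed test, rather than stopping at the first, is a design choice of this sketch that makes rejections easier to diagnose.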

IKE with User-Host Authentication. As mentioned above in the Sherlock 
architecture, both the user and the STC on which the user is working are jointly 
authenticated. Although the IKE specification mentions off-handedly that mul- 
tiple certificates may be included in a certificate payload, it does not specify how 
these are to be interpreted, how to create or interpret the multiple identities or 
multiple signatures that are required for dealing with multiple certificates, much 
less does it describe how to deal with certificates issued by multiple CAs. 

We extended IKE to support these so that Sherlock could implement user- 
host authentication. In this subsection, we describe how we form the certificate 
payload, the identifier payload, and the signature payload for Sherlock’s version 
of IKE. 

We form the certificate payload simply by concatenating the user and host 
certificates. This payload is supported by the ASN.1 definition of certificate 
payloads used in the IKE specification. 

We form the identifier payload by interpreting the user and host identities as 
if they were fully qualified domain names (FQDNs), and composing them into 
a user fully qualified domain name (UFQDN). This format is also supported by 
the IKE specification, as the identifier payload has no specific interpretation, 
other than that it must be unique with respect to its CA. 

Consider, for example, a user named “fred” by the “blackcat” domain, logging 
into that domain from a host named “host26” by the “local” domain. In this 
example, Sherlock assigns the FQDN “fred.blackcat” to the user, the FQDN 
“host26.local” to the host and the UFQDN “fred.blackcat@host26.local” to the 
{user, host} combination. The UFQDN is used as the IKE identifier payload. 

Other methods for composing an identifier may, of course, be used. We chose 
a UFQDN interpretation for its simplicity of design and ease of debugging, as 
code already existed to decode identities as FQDNs and UFQDNs. 
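
The identifier composition in the example above can be illustrated with a short sketch. The function names are hypothetical and the input checks are an assumption of this sketch. Only the resulting string format, “user.domain@host.domain”, comes from the text.

```python
from typing import Tuple


def fqdn(name: str, domain: str) -> str:
    """Join a name with the domain that names it, e.g. 'fred' and 'blackcat'."""
    for part in (name, domain):
        if not part or "@" in part:
            raise ValueError(f"invalid identity component: {part!r}")
    return f"{name}.{domain}"


def ufqdn(user: str, user_domain: str, host: str, host_domain: str) -> str:
    """Compose the {user, host} identifier used as the IKE identifier payload."""
    return f"{fqdn(user, user_domain)}@{fqdn(host, host_domain)}"


def split_ufqdn(identifier: str) -> Tuple[str, str]:
    """Recover the user FQDN and the host FQDN from a UFQDN."""
    user_fqdn, sep, host_fqdn = identifier.partition("@")
    if not sep or not user_fqdn or not host_fqdn or "@" in host_fqdn:
        raise ValueError(f"not a UFQDN: {identifier!r}")
    return user_fqdn, host_fqdn


if __name__ == "__main__":
    print(ufqdn("fred", "blackcat", "host26", "local"))
    # prints "fred.blackcat@host26.local"
```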

We form the signature payload by concatenating what would be the user and 
host signatures, i.e., the signatures that would be formed if the user and the host 
had individually signed the IKE messages. As with standard IKE, we compute 
the user signature by applying a pseudo-random function, e.g., MD5, to the 
{identifier, cookie} combination, then encrypt the result with the user’s private 
key. We compute the host signature similarly, except that the host’s private key 
is used instead of the user’s private key. Of course, we use the UFQDN as the 
identifier. 
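
The structure of the signature payload can be sketched as follows. Several things here are assumptions rather than Sherlock's implementation. SHA-256 is swapped in for the pseudo-random function (the text names MD5 only as an example). The private-key operations, which in Sherlock are performed by the smart card and the iButton, are represented by hypothetical `Signer` callbacks. `toy_signer` is a keyed-hash stand-in so the sketch runs with the standard library alone, and it is not RSA. Plain concatenation also assumes both signatures have a known, fixed length so the receiver can split them.

```python
import hashlib
import hmac
from typing import Callable

# A Signer models a token's private-key operation: digest in, signature out.
Signer = Callable[[bytes], bytes]


def digest(identifier: str, cookie: bytes) -> bytes:
    """Hash the {identifier, cookie} combination (SHA-256 stands in for the prf)."""
    return hashlib.sha256(identifier.encode("ascii") + cookie).digest()


def signature_payload(
    identifier: str, cookie: bytes, user_sign: Signer, host_sign: Signer
) -> bytes:
    """Concatenate the user signature and the host signature over one digest."""
    d = digest(identifier, cookie)
    return user_sign(d) + host_sign(d)


def toy_signer(key: bytes) -> Signer:
    """Hypothetical stand-in for a private-key operation; NOT real RSA."""
    return lambda data: hmac.new(key, data, hashlib.sha256).digest()


if __name__ == "__main__":
    payload = signature_payload(
        "fred.blackcat@host26.local",   # the UFQDN is used as the identifier
        b"\x01" * 8,                    # made-up cookie value
        toy_signer(b"user-key"),
        toy_signer(b"host-key"),
    )
    print(len(payload))
```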

Multiple Certificate Authorities. Sherlock’s user-host authentication design 
point arose from our concern that a system employing a single CA is as 
susceptible to compromise as is the CA. Very little reliable information is 
available about the security of CAs. An important rule of high assurance, 
governmental or commercial, is to avoid any single point of security failure. Thus, Sherlock 
relies on joint authentication by two CAs operating independently, i.e., with- 
out any form of trust relationship. To compromise Sherlock’s authentication 
function, then, both CAs would have to be compromised independently and 
simultaneously. 

Sherlock’s dual-CA approach could be generalized to multiple CAs, although 
it is unclear that doing so would provide substantially more security than the 
dual-CA approach. 



Authentication Policy. In addition to joint authentication, Sherlock uses so- 
phisticated policy techniques for increased assurance. As described above, a user 
attempting to access a trusted domain must be authenticated by that domain 
and the STC from which they are attempting access must be authenticated by 
the STC’s local domain. Additionally, the trusted domain must have a policy 
allowing access from the local domain and the local domain must have a policy 
allowing access to the trusted domain. 
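
The four conditions in the paragraph above combine into a single conjunction, sketched below. The policy tables and parameter names are hypothetical. As the following paragraph notes, policy specification and enforcement were not completed in the prototype, so this illustrates only the stated rule, not working Sherlock code.

```python
from typing import Dict, Set


def access_allowed(
    user_authenticated: bool,          # user authenticated by the trusted domain
    host_authenticated: bool,          # STC authenticated by its local domain
    trusted_domain: str,
    local_domain: str,
    trusted_allows_from: Dict[str, Set[str]],  # trusted domain -> permitted local domains
    local_allows_to: Dict[str, Set[str]],      # local domain -> permitted trusted domains
) -> bool:
    """Grant access only if both authentications and both policies agree."""
    return (
        user_authenticated
        and host_authenticated
        and local_domain in trusted_allows_from.get(trusted_domain, set())
        and trusted_domain in local_allows_to.get(local_domain, set())
    )


if __name__ == "__main__":
    # Made-up policy tables reusing the domain names from the earlier example.
    print(access_allowed(True, True, "blackcat", "local",
                         {"blackcat": {"local"}}, {"local": {"blackcat"}}))
```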

Determining and enforcing policy is an area in which the Sherlock prototype 
is incomplete. Ideally, policy enforcement should be split among the security 
gateway, the local domain, the SM’s iButton and the user’s smart card and that, 
in fact, is described by the Sherlock architecture. Actually specifying policy in 
a meaningful way without a sophisticated user interface is a daunting task and 
one that we did not complete. Additionally, implementing distributed policy 
enforcement in a manner that can be debugged readily is not a trivial task. 

4 Preliminary Results 

The feasibility of the Sherlock approach has been established by the demon- 
stration of a testbed now working in a government facility. The hardware has 
been fabricated successfully and integrated, and is undergoing extensive testing. 
The software has also proven itself and is running on the hardware. Testing and 
fine-tuning are underway. 



5 Conclusions 

Where at one time it was sufficient to have one network to which everyone in a 
particular agency or organization had full access, it is now more common in the 
Intelligence Community to have a variety of physically separate, independently 
managed networks with differing classification levels. Users often need to access 
several of such networks on a daily basis, generally from different computers. 
Sherlock replaces the multiple computers with a single, secure, COTS thin client. 

We defined the concept “commercial high assurance” as the highest assurance 
that can be achieved while maintaining commercial viability. We found that, with 
certain carefully chosen enhancements, commercial products can achieve high 
levels of assurance and can be used in some situations traditionally reserved 
for government-designed devices. In the process, we designed and prototyped a 
system for secure networking. 

Sherlock demonstrates that commercial high assurance can be high enough 
for some, if not many, government uses. 

Appendix: Software Incorporated in the Prototype 

We incorporated the following open source and commercial software in the 
Sherlock prototype: 

Etherboot: Etherboot is a program that downloads an operating system for a 
thin client. It handles both the BOOTP exchange and the TFTP download 
of the O/S image. After the STC has completed clearing its state and the 
SM has verified that the state has been cleared, the SM copies Etherboot 
into the NetVista’s memory and the NetVista executes Etherboot. The only 
modification we made to it was the addition of a custom driver that makes 
the SM look like an Ethernet adapter. 

General Software Embedded BIOS: The BIOS image on the NetVista is a 
slightly modified version of General Software’s Embedded BIOS. For security 
reasons, support for most of the NetVista’s I/O devices, e.g., serial, parallel 
and on-board Ethernet, was removed and support for state clearing and for 
accessing the SM was added. 

OpenBSD: In the prototype, the security gateways are PC clones running a 
version of OpenBSD that was modified to support user-host authentication. 
In addition, much of the IPSec and IKE software that was added to the SM 
was ported from OpenBSD. 

Real Time Executive for Multiprocessor Systems: Developed by Online 
Applications Research Corp. for the U.S. Army’s Redstone Missile Command. 
RTEMS is a mature, fully featured, embedded O/S with roots in 
military systems and, as a result, is highly reliable. 

Red Hat Linux: Prototype STCs run a scaled down Red Hat Linux distribution, 
which they download from the trusted domain once they have been 
allowed in. Additionally, in the prototype each trusted domain contains an 
NFS server that exports the STC’s file systems as NFS mounts. This server 
runs a full scale Red Hat Linux system. In a production implementation, 
Microsoft’s CIFS could serve as an alternative protocol for STC file system 
access. 

WindowsNT Server/TSE and Citrix Metaframe: When a secure domain 
contains a WindowsNT server running Terminal Server Edition, the Citrix 
Metaframe product allows users to run WindowsNT applications on trusted 
servers as if they were running locally, i.e., on an STC and displaying the 
Windows user interface. No modifications were made to TSE or Metaframe 
for the Sherlock system. 

Abstract. Increasing numbers of computer security vulnerabilities 
mean that, more than ever before, internetworked computers are at risk 
from attack. Unfortunately research to date has not found suitable solu- 
tions to these problems and therefore further work is required in order 
to understand what is necessary to develop secure systems. This study 
sought to explore the relationship between the development process and 
the security of the fielded system. Specifically an attempt was made to 
analyse the “real-world” security of three modern Unix systems and this 
was compared with the consideration of security during their develop- 
ment. The results not only show that a consideration of security at all 
phases of development leads to significantly more secure products, but 
also indicate the specific roles that each development phase plays in this 
process. 



1 Introduction 

In recent years the role networking and the Internet have played in the usage of 
computer systems has increased dramatically. Both home and business users are, 
more and more, going “online” to conduct their daily affairs in areas ranging from 
simple communication of ideas, to buying and selling. In accord with this trend, 
software vendors are adding Internet-related features to their products, even for 
some packages which previously had little to do with networking. However, this 
move online has meant that not only are there more opportunities for computer 
systems to exhibit security flaws, but these systems are now more exposed to 
those who would seek to exploit such flaws. Evidence of this can be seen in the 
increasing numbers of security problems being discovered and reports of system 
break-ins [16]. It appears that software vendors have not kept up with the range 
of threats to which their newly network-enabled software is exposed. 

Clearly it is important that a solution be found to prevent or nullify these 
software flaws. One solution may be to tackle the problem head-on and at- 
tempt to reduce software security faults by improving the development process 
itself. Unfortunately it appears that developing secure software is difficult, time- 
consuming and thus often costly [11], [15], [1]. This raises questions as to what 
role the development process plays in determining the security of the fielded 
system. In particular, what forms and aspects of security-related work during 
development lead to better security in the resulting systems? Additionally, what 
can be done during development to improve security and at what cost? 

This paper reviews research that took three modern Unix-based systems and 
attempted to semi-quantitatively assess their security through an examination 
of both their security features and security failings. Then, for each system, the 
consideration of security during each phase of the development process was clo- 
sely studied. Correlations and relationships were then explored between both 
the amount and nature of the security work done during development and the 
apparent security of the released product. From these a number of interesting 
conclusions and observations were drawn concerning optimising the development 
process for security. For full details of this work, refer to [12]. 

2 Background and Related Work 

Incidence of computer break-ins and other security problems appears to be on 
the rise. Traffic on security-related mailing lists is increasing and security issues 
are starting to regularly make headlines in both the printed and online media. 
Statistics released earlier this year show that, not only are there large numbers 
of security problems being reported, but this number is also increasing steadily 
[16]. Such statistics raise the question: what work has been, and is being, done by 
researchers and software vendors alike to improve the situation? 



2.1 Formal Methods Research 

Just as networking technology today has increased the exposure of computers to 
potential security threats, so too in the 1960s, the advent of time-sharing systems 
raised specific concerns regarding the interactions between multiple users (and 
their data) these made possible. Particularly evident among the US military 
and government agencies, these concerns spawned research with the objective of 
developing models and methods for analysing system security and, eventually, 
developing “verifiably secure” systems [7], [14, Chapt. 2]. 

At the centre of this research was the Bell & LaPadula model [3] which 
provided a formal expression of multi-level (confidentiality-based) security. The 
model stimulated a great deal of other related research and was widely regar- 
ded as providing the theoretical basis on which subsequent research could build 
in eventually obtaining provably secure systems. This model also provided the 
basis for the development of the US Trusted Computer System Evaluation Cri- 
teria (TCSEC) or “Orange Book” [17]. TCSEC defines security classifications 
and criteria for systems with the intention being that vendors could develop 
their systems with compliance in mind. A formal evaluation process could then 
be undertaken with the successful outcome being that the system would be 
pronounced as officially certified at a given security rating. These standardised 
classifications would supposedly provide buyers (particularly those in military or 
government circles) with an effective guarantee of the degree of security afforded 
by the system. Such was the confidence in this scheme and in the theoretical 
research behind it that Patrick R. Gallagher Jr., the then Director of the US 
National Computer Security Center, stated in the foreword to one of the 
standards documents [10] that “[TPEP]¹ will result in the fulfilment of our country’s 
computer security requirements.” 

However, the results do not appear to have met these early expectations. 
Instead, as noted previously, security vulnerabilities are steadily increasing. Some 
might suggest the cause of this is that, today, there is not widespread acceptance 
of the security standards created.² Very few contemporary systems are designed 
with compliance in mind and fewer still successfully complete evaluation. The 
possible reasons for this are numerous. For one, there has not been widespread 
consumer acceptance of, and often even interest in, these government-endorsed 
standards. The standards (particularly TCSEC) have been widely criticised for 
a variety of reasons such as being too secrecy-centric and tailored to military- 
style requirements [7], [13], [14]. To many buyers of computer systems “research 
into specification methods for the analysis and design of information systems 
security often seems esoteric and remote” [2]. With this lack of consumer interest 
in such systems, the corresponding vendor disinterest is unsurprising. Vendors 
are unlikely to implement any feature that customers don’t care about, let alone 
something like security that requires significant effort and cost. People want 
better, faster, more feature-packed systems; attributes like security are often too 
intangible to be marketable. On the other hand, the security evaluation process 
is a relatively slow one and it typically involves the consideration of a static 
development state of the system concerned. Peter Neumann observes [11]: 

System purveyors have not been sufficiently eager to develop [multi-level 
secure] systems; the evaluation process seems to have hindered progress 
because of its complexity and the long delays involved in the process, and 
because the systems are generally moving targets that undergo continuing 
improvements. Overall, formal methods have had only limited success in 
being applied to the development of widely used commercial systems. 

Despite the criticisms of formal method-style approaches, and the models and 
evaluation criteria they have produced, it would be wrong to label this past 
research invalid or useless. Instead it seems much of this work was concerned with 
theoretical models and the resulting designs which, on their own, cannot solve 
computer security problems. In criticising the “Basic Security Theorem” of the 
Bell & LaPadula model, one researcher noted that those who regard this model 
as a foundation for developing secure systems have “confused the theorem with 
the non-trivial task of proving that an implementation meets the conditions of a 
given security definition” [8]. This task is common to both compliant “trusted” 
systems and the conventional commercial systems that are in widespread use. 
Computer security is not a simple problem that can be solved in a single stroke. 
Instead it is a difficult, complex and subtle issue that requires consideration at 
all levels of the development process in order to be properly dealt with [11], [15]. 
It is on this basis that the following research is presented. 

¹ Trusted Product Evaluation Program, the programme for evaluating the compliance 
of a system with the guidelines given in TCSEC. 

² This includes TCSEC and its “relatives” such as ITSEC and the Common Criteria. 

2.2 Security and the Development Process 

Given that the process of designing and implementing secure software is plainly 
a difficult one, one might ask: what is the relationship between this and the soft- 
ware development process itself? This process has been well-studied with many 
different methodologies being proposed. For example, the well-known “waterfall” 
or “life-cycle” model involves splitting the process into different phases 
(typically analysis, design, implementation and maintenance), each dealing with 
a different stage of development [19]. Despite the obvious fact that the security 
of a piece of software is ultimately connected with the way it was developed³, 
few studies have even begun to explore the relationship between security and the 
development process, and those that do tend to be limited to only one aspect or 
development phase [9], [4]. However, any truly effective approach must compre- 
hensively cover all development phases and consider “how systems are conceived, 
designed, developed, modelled, analyzed and used” [11]. The following discusses 
a study which analysed the development approaches taken in three modern Unix 
systems and related this to the security that each has exhibited. 

3 Research Objectives and Design 

As already stated the objective of this research was to explore the relationships 
between system security and the consideration of security during development 
(both the amount of work undertaken and its nature). In particular it was impor- 
tant to establish what specific security-related activities were being performed 
at a given development stage and attempt to analyse what effect these had on 
the product’s security. Identifying the effectiveness of such activities would al- 
low developers to determine whether the relative cost of each would warrant its 
inclusion in the development of a given product. Also of interest were practical 
relationships between different aspects of security and how these influence the 
system’s overall security. 

In general the goal of this study was to show that a comprehensive approach 
to security, where security issues are examined and addressed at each stage of the 
development life cycle, results in systems that have more (and superior) security 
features, exhibit fewer vulnerabilities and are thus less susceptible to compromise. 



3.1 Overall Research Design 

This research involved performing independent analyses of both the development 
approach and the security for each of the three systems. The three systems chosen 
were Debian GNU/Linux 2.1, Sun Microsystems Solaris 7 and OpenBSD 2.5. At 
the time the work was conducted, these were the latest versions. The three 
were chosen on the basis of popularity (and thus general relevance), publicised 
consideration of security, openness of the development work (thus aiding and 
facilitating detailed research) and availability of software. Debian represents a 
widely available distribution of the increasingly popular free Unix-like GNU 
system using the Linux kernel; some estimates put GNU/Linux-based systems 
as the most popular server systems on the Internet [20]. Solaris was chosen 
given that it is currently the most popular commercial Unix variant in use on 
the Internet. It is also a fairly typical System V-based Unix system. OpenBSD 
was chosen based upon its strong reputation as a security-oriented system. The 
study was restricted to Unix-based systems in order to maintain consistency 
for comparison purposes (given that all Unix systems tend to have very similar 
security models) and because the author’s expertise lay with such systems. 

³ While the way it is used also plays a role, this is outside the scope of this study. 



3.2 Analysing the Development Approach 

Although the projects⁴ studied did not necessarily employ the “waterfall” or 
“life-cycle” methodology, it is believed this is still a legitimate model to use 
in analysing their development. In particular, viewing the development process 
as a series of largely sequential phases (analysis, design, implementation and 
maintenance) provides structure and context to specific development activities 
and allows a more detailed view to be obtained, even though the order and 
organisation of these activities may not necessarily have conformed with this 
methodology. As such, the different phases were used as a way of modelling and 
interpreting the development process as opposed to a methodology for conduc- 
ting this process. 

Despite utilising such a model, gaining a detailed insight into the develop- 
ment process is not simple in practice. For the analysis phase, the project’s ob- 
jectives and requirements (both stated and implicit where possible) were deter- 
mined. Also considered were the target market and established user-base of the 
system. Considering the design phase involved examining the system’s functio- 
nality where this related to security. This involved not only the security features 
of the system but also other issues such as default configurations which showed 
security-conscious planning and forethought. Any security problems exhibited 
that were the result of a blatant design flaw were also considered. Implementa- 
tion was examined by the presence or absence of any particular “coding policy” 
adhered to by the project. Any background or education provided by the project 
for its developers regarding secure programming techniques was considered, as 
was the degree of emphasis on code quality. Security flaws relating to the sy- 
stem which were the result of blatant implementation errors were also included. 
Finally assessing maintenance involved studying project testing and auditing 
procedures. Did the work involve some sort of ongoing commitment to main- 
taining and improving the quality and security of the release or was the focus 
largely on adding features to the next release? Management of security updates, 
advisories and upgrades also gave insight into work conducted by developers on 
security during the maintenance phase. 

The sources of information for this assessment were numerous. Frequently 
the project would have a web site (or sometimes several, dealing with different 
aspects of the system) containing often extensive documentation regarding both 
the system’s operation, design and development. All three systems were installed 
and tested and the online documentation provided was reviewed. Interviews 
were also utilised to attempt to get an insight into the views of the developers 
themselves. 

⁴ For the sake of convenience, the word “project” will be used throughout to refer 
both to volunteer software projects and commercial products. 



3.3 Development of the Security Analysis Metric 

Quantifying security in a general way is a difficult thing to do. Security has 
many subtleties and exceptions and establishing general rules that correctly 
deal with all of these is probably impossible. Previous work has tended to focus 
on establishing evaluation criteria in order to place systems into appropriate 
security classifications. However, such ordinal scales often provide insufficient 
detail. For example, it appears that two of the systems studied would fall into 
the same classification despite obvious differences in their levels of security. One 
previous study [18] attempts to provide a framework for measuring security, 
however, these guidelines are not yet easily generalised and applied. Despite 
this, the metric developed here builds upon many of the concepts given in [18], 
in addition to several well-established security constructs discussed in [13], [17], 
[5] and [14]. Unfortunately, due to limited space, full details of the metric’s 
development cannot be presented here. For a comprehensive discussion of the 
security analysis metric used refer to [12]. 



Metric Requirements. Several requirements for the metric were established. 
Among other things it was required that the metric not exhibit significant bias 
in the event that an analyst had differing levels of familiarity with one or more 
of the systems being considered. The results provided must not be opaque to 
further analysis, criticism and adjustment while not being so detailed as to be 
unwieldy. It should allow sensible and meaningful “real world” security compari- 
sons between systems, without necessarily providing independently quantitative 
results. The purpose of the metric was only to facilitate comparison between 
effective security and development approach and not to provide security bench- 
marks. Consequently the metric needed to be simple and easy to apply and not 
require extensive verification, validation or calibration before use. 

Finally the metric needed to be consistent with observations and guidelines 
given in past research. In harmony with this, the metric developed centred on 
four “dimensions” of security, namely confidentiality, integrity, availability and 
audit. These were adapted from constructs and ideas discussed in [18], [6], [14], 
[17], [13]. Utilising these four classifications for different components or aspects 
of security significantly aided the structuring of the security analysis process. 



Features versus Vulnerabilities. In order to attempt to measure or assess
security, we separated this construct into two key parts. The first is the set of
security features that the system contains, for example cryptographic libraries,
authentication systems, access control lists and secure networking tools. Secondly,
the security of a system is clearly dependent on how vulnerable it is to compromise.
Systems that exhibit large numbers of vulnerabilities are clearly less secure, even 
when vendors provide patches within a relatively short space of time. Many
administrators fail to apply such patches, whether because they are negligent or
too busy with other apparently more urgent matters. Since vendor “security
advisories” are often released through a limited set of channels (for example,
security mailing lists to which most system administrators probably aren’t
subscribed), administrators may not even be aware of a problem, and many
installations will remain vulnerable long after it has been fixed⁵. Furthermore,
the existence of one vulnerability in a system very often suggests the presence
of other, as yet undiscovered, problems. At the same time, security features
such as firewalls, strong authentication systems and intrusion detection schemes
can often significantly assist in repelling such attacks. Thus the combination
of security features (positive) and recorded security vulnerabilities (negative)
represents a reasonable method for estimating the degree of security provided by
a system⁶.

⁵ It has also been suggested that attackers could easily catalogue the
configurations of systems that they “probe” and, within minutes of the
announcement of a new security problem, have privileged access on many hundreds
of vulnerable machines. Under such circumstances even the most diligent
administrators would be hard pressed to defend their systems.

⁶ This concept is somewhat analogous to that of describing the amount of “trust”
provided by a system based upon the security functionality it provides and the
assurance that the functionality can correctly enforce security policy, and is a
fundamental tenet of other security assessment criteria [13], [14, Chapt. 6],
[17].



Quantifying the Value of Features. Even splitting the property of “security”
into features and vulnerabilities does not make measuring it easy; further
divisions need to be made. Firstly, each feature is given a score out of 10, half
for the effectiveness of the feature and half for the feature’s importance.
Effectiveness is determined by how successful the feature is in tackling the
specific goal or problem for which it was designed. A security feature should be
rated solely on this and should not rate poorly because it does not assist in
solving other, unrelated security issues. Situations where scoring effectiveness
can be problematic include when a feature provides a key element of protection
but is largely useless if certain other measures are not used in conjunction with
it. Alternatively, the feature might have the potential to be very valuable but
only if it is configured correctly, and the configuration process may require
knowledge or skills that not all administrators or users are likely to possess.
There are no universal answers to these questions and the analyst is required to
carefully consider the feature in question before rating its effectiveness. The
importance component is determined by how critical the issue the feature deals
with is in relation to the system’s overall security.

Once a feature has been assigned its value out of 10, it is classified into one
or more of the four security dimensions given previously. In addition to
assigning a numeric value to the feature, the analyst must also fully discuss and
document it, explaining the rationale behind its inclusion, classification and
score. Once all the features for a given system have been rated, the scores are
totalled up separately for each dimension and each of the four totals is divided
by the number of features which were classified into that dimension, giving an
overall value (arithmetic mean).
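The feature-scoring procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and function names, the example features and their ratings are all hypothetical, and returning 0.0 for a dimension with no entries is a choice made here rather than something the text specifies for features.

```python
from dataclasses import dataclass

DIMENSIONS = ("confidentiality", "integrity", "availability", "audit")


@dataclass(frozen=True)
class Feature:
    name: str
    effectiveness: float  # 0-5: how well the feature meets its own goal
    importance: float     # 0-5: how critical that goal is to overall security
    dimensions: tuple     # one or more entries from DIMENSIONS

    def __post_init__(self):
        if not (0 <= self.effectiveness <= 5 and 0 <= self.importance <= 5):
            raise ValueError("each half of the score must lie in [0, 5]")
        if not self.dimensions or any(d not in DIMENSIONS for d in self.dimensions):
            raise ValueError("a feature needs at least one known dimension")

    @property
    def score(self) -> float:
        """Score out of 10: effectiveness plus importance."""
        return self.effectiveness + self.importance


def dimension_means(features):
    """Arithmetic mean of feature scores per dimension (0.0 if none, by assumption)."""
    means = {}
    for dim in DIMENSIONS:
        scores = [f.score for f in features if dim in f.dimensions]
        means[dim] = sum(scores) / len(scores) if scores else 0.0
    return means


# Hypothetical ratings, invented purely for illustration.
features = [
    Feature("packet filter", 4.0, 4.5, ("confidentiality", "integrity", "availability")),
    Feature("system call logging", 3.5, 3.0, ("audit",)),
    Feature("encrypted remote login", 4.5, 5.0, ("confidentiality", "integrity")),
]
means = dimension_means(features)
```

Because a feature may sit in several dimensions, its score contributes to each of those means, which is why the per-dimension counts need not add up to the number of features.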

Note that some features were deliberately omitted from the analysis. These 
included features whose purpose was purely to prevent the creation of a security 
hole, those whose primary purpose was not security but merely had some ad- 
ditional security “side-effect” and features which existed in identical form in all 
three systems considered. Optional packages were also typically omitted unless 
it was judged that these were very likely to be installed on most systems. 



Quantifying the Vulnerabilities. A similar process was undertaken for the 
system’s vulnerabilities. A score out of 10 was assigned for each vulnerability, 
this being broken down into extent of impact (rated out of 5), ease of exploitation 
(out of 4) and difficulty of solution (out of 1). The extent of impact referred to
the potential damage the vulnerability could cause. Ease of exploitation referred 
to special conditions that would impact on the potential for an attacker to suc- 
cessfully exploit the vulnerability. This included access requirements (local shell 
access versus remote anonymous access etc.), difficulty in establishing conditi- 
ons to make exploitation either possible or more favourable and the existence 
of software which would make exploiting the vulnerability significantly easier. 
Difficulty of solution referred to the effort required to eliminate the vulnerabi- 
lity ranging from a minor configuration adjustment, to the installation of a new 
version or patch, to complex or manual patching which might be required if the 
vendor did not provide a solution. 

Upon assigning a score, the vulnerability was classified into one or more of 
the security dimensions specified. The total score for each of the four dimensions 
was then divided by the number of vulnerabilities in the dimension to give a 
mean value for this. As with the security feature analysis, the analyst is required 
to document fully the inclusion, classification and ratings for each vulnerability. 
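As a rough sketch of the three-part vulnerability score, the snippet below adds the impact (out of 5), ease of exploitation (out of 4) and difficulty of solution (out of 1) components and averages them per dimension. The names and the two example vulnerabilities are hypothetical; the treatment of an empty dimension as 0.0 is an assumption that mirrors the 0.00 reported in Table 2 for a dimension in which no vulnerabilities were recorded.

```python
from typing import NamedTuple


class Vulnerability(NamedTuple):
    name: str
    impact: float          # extent of impact, 0-5
    ease: float            # ease of exploitation, 0-4
    fix_difficulty: float  # difficulty of solution, 0-1
    dimensions: frozenset  # one or more security dimensions


def vulnerability_score(v: Vulnerability) -> float:
    """Score out of 10 as the sum of the three rated components."""
    if not (0 <= v.impact <= 5 and 0 <= v.ease <= 4 and 0 <= v.fix_difficulty <= 1):
        raise ValueError("component outside its permitted range")
    return v.impact + v.ease + v.fix_difficulty


def mean_severity(vulns, dimension: str) -> float:
    """Mean score of the vulnerabilities classified into one dimension."""
    scores = [vulnerability_score(v) for v in vulns if dimension in v.dimensions]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical entries, invented purely for illustration.
vulns = [
    Vulnerability("remote buffer overflow", 5.0, 3.5, 0.5,
                  frozenset({"confidentiality", "integrity"})),
    Vulnerability("local denial of service", 2.0, 2.0, 0.5,
                  frozenset({"availability"})),
]
```

Here `mean_severity(vulns, "integrity")` evaluates to 9.0 and `mean_severity(vulns, "audit")` to 0.0.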



Overall Results. An overall result for each system was then obtained by
subtracting the average score for vulnerabilities from the average for features.
This was performed for each dimension, giving four overall results which were
averaged to give a final unscaled score. The unscaled score represents a weighting
of how good the operating system’s features were against how bad its
vulnerabilities were. To also consider the number of each of these, a scaling
factor is introduced which is the ratio of features to vulnerabilities. For
consistency, if the unscaled value was negative then it was divided by this
scaling factor, otherwise it was multiplied. The result is a quantitative estimate
of the system’s security and was used as the overall method for comparing the
systems.
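The calculation described above can be written compactly as follows. The symbols are introduced here for illustration and do not appear in the original text: F_d and V_d are the mean feature and vulnerability scores in dimension d, D is the set of four dimensions, N_F and N_V are the numbers of features and vulnerabilities, U is the unscaled score, k the scaling factor and S the final score. Taking N_F and N_V to be the uncategorised totals is an assumption, although it reproduces the scaling factors reported in Table 2 (for example 15/12 = 1.25 for Debian).

```latex
\[
U = \frac{1}{4}\sum_{d \in D}\left(F_d - V_d\right), \qquad
k = \frac{N_F}{N_V}, \qquad
S =
\begin{cases}
U \cdot k & \text{if } U \ge 0,\\[2pt]
U / k     & \text{if } U < 0.
\end{cases}
\]
```

Dividing rather than multiplying when U is negative means that a ratio k > 1 (more features than vulnerabilities) always moves S in the favourable direction, and a ratio k < 1 always moves it in the unfavourable one.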



Discussion of Metric Design and Validity. It is important to remember 
that the purpose of the metric was purely to facilitate a comparison between the 
security of the systems studied and the consideration of security during deve- 
lopment. For such a purpose, the metric performs well. The use of an interval 
scale provides both detailed and quantitative information not available by any 
other means. Prominent use of averages reduces the impact of biases that may 
exist, for example if the analyst were more familiar with one system or if one
system was more widely used and thus had more reported vulnerabilities.
Furthermore, the layered approach of the metric, together with the requirement
for full documentation, allowed the results to be inspected at varying levels of
detail, preventing single figure results from being opaque to further scrutiny⁷.
Despite these advantages the metric has problems, primarily that it relies on a
value judgement from the analyst, which limits the external validity of the
results. However, the internal validity of the results still appears to be intact
and, since the results from the metric were only required to be used within the
context of this study, it is not believed that this presents a significant
problem. For a full discussion of the metric’s design, development and
performance, refer to [12].

⁷ In particular, the multi-layered approach is consistent with the nature of
security where details and subtleties are often crucial.

It is believed that this metric could be utilised in future studies if a
practical measure of an operating system’s security were required. However, unless
the metric were substantially modified (for example, by having several analysts
comparing multiple releases of each system), the results would still lack external
validity. Further research to develop externally valid security metrics could prove
worthwhile if used to better understand the construct of security itself. However,
the author believes that researchers should beware of attempting to develop
security “benchmarks” for systems, the results of which could easily be misused by
vendors to market their products. Security is, and likely will always be, a very
difficult construct to quantify.



4 Discussion of Results

4.1 Security Analysis Metric Results

A broad range of both features and vulnerabilities were considered in analysing
the security of the three systems. Table 1 summarises the overall distribution
of these⁸. The total numbers of features and vulnerabilities for each system
(“Uncategorised Total”) are given and the way these were categorised into the
four specified security dimensions is also shown. Note that, since a feature or
vulnerability could fall into more than one dimension, the totals for the four
dimensions will not add up to the uncategorised total.

⁸ Due to space limitations only a minimal set of results has been given here. As
before, for the full set of figures, raw results and documentation refer to [12].

Table 1. Numbers of Features and Vulnerabilities

                               Debian   Solaris   OpenBSD
Categorised Features
  Confidentiality                   5         6        10
  Integrity                         8         6        12
  Availability                      3         2         2
  Audit                             5         3         4
  Uncategorised Total              15        11        18
Categorised Vulnerabilities
  Confidentiality                   6        15         1
  Integrity                         5        20         4
  Availability                      5         3         1
  Audit                             3        13         0
  Uncategorised Total              12        21         5

The results show that OpenBSD had the most security features, followed by
Debian and then Solaris. Also of note was a general emphasis on confidentiality
and integrity features with relatively few availability or audit features. It is
believed there are several reasons for this trend. Firstly, it is widely accepted
that the main security problem with Unix systems is the concentration of privilege
with the superuser [6]. While early Unix systems were designed to work in an
environment to facilitate information sharing amongst a small, isolated,
tightly-knit group of users, for which such a design decision is reasonably valid,
today with their use as Internet servers this is no longer the case. As a result,
protection of the root account is paramount and thus a key focus of Unix security
features.
Further to this, closer examination of the results shows that part of the reason 
for there being many confidentiality and integrity features is that these dimen- 
sions are often closely related. While theoretical models typically separate the 
two, in practice this often does not hold with a confidentiality feature typically 
also being an integrity feature (and vice versa). For example, if a user wishes 
to protect a file from being read by other people on the system then they very 
often want to prevent others from modifying or deleting it also. In Unix this 
connection can be plainly seen in a situation where an attacker seeks to gain ac- 
cess to a given identity (typically the superuser’s). Upon achieving this identity, 
immediately all the identity’s rights, both confidentiality and integrity related, 
are conferred upon the attacker. Therefore all features designed to prevent an 
attacker increasing their privilege in some way are effectively both confidentiality 
and integrity features. The final reason for the prominence of features in these 
two dimensions is the rarity of features in the other two. In general it seems that 
the inclusion of availability or audit features represents a deliberate decision to 
specifically include such a feature, whereas a confidentiality or integrity feature 
is often a more “general” security feature. 

For the vulnerabilities, OpenBSD had by far the fewest and Solaris had by far
the most, with Debian sitting almost perfectly spaced between them. In addition,
the vulnerabilities exhibited by OpenBSD were generally quite minor while those
of the other systems were often severe (compare Table 2).
Other observations concerning the vulnerabilities analysed are mostly the same
as those for features. The same connection between confidentiality and integrity
appears to exist although it isn’t quite as pronounced. Availability vulnerabilities
tended to represent problems which allowed attackers to affect normal system
operation but not to actually increase their level of privilege. Audit problems
tended to be side-effects of other types of vulnerabilities whereby an attacker
could manipulate or destroy the audit trail generated by the system due to
some other security breach. A summary of the scores obtained from the security
analysis metric is given in Table 2 in Section 4.2.
Table 2. Security Analysis Metric Scores

                       Debian   Solaris   OpenBSD
Features
  Confidentiality        5.90      6.08      7.50
  Integrity              5.88      6.17      7.38
  Availability           7.00      5.75      6.00
  Audit                  6.90      5.67      7.25
  Average                6.42      5.92      7.03
Vulnerabilities
  Confidentiality        6.75      8.13      4.50
  Integrity              7.70      7.40      4.25
  Availability           8.10      7.00      8.00
  Audit                  8.33      8.42      0.00
  Average                7.72      7.74      4.19
Unscaled Score          -1.30     -1.80      2.80
Scaling Factor           1.25      0.52      3.60
Final Score              -1.0      -3.5      10.2
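As a consistency check, the final scores in Table 2 can be recomputed from the per-dimension means in Table 2 and the uncategorised totals in Table 1. The sketch below is not the author's tooling; the function names are hypothetical, and it assumes that the scaling factor is the ratio of the uncategorised totals.

```python
# Per-dimension means (confidentiality, integrity, availability, audit) from
# Table 2, and uncategorised totals (features, vulnerabilities) from Table 1.
SYSTEMS = {
    "Debian": {
        "features": (5.90, 5.88, 7.00, 6.90),
        "vulns": (6.75, 7.70, 8.10, 8.33),
        "counts": (15, 12),
    },
    "Solaris": {
        "features": (6.08, 6.17, 5.75, 5.67),
        "vulns": (8.13, 7.40, 7.00, 8.42),
        "counts": (11, 21),
    },
    "OpenBSD": {
        "features": (7.50, 7.38, 6.00, 7.25),
        "vulns": (4.50, 4.25, 8.00, 0.00),
        "counts": (18, 5),
    },
}


def unscaled_score(features, vulns):
    """Mean over the dimensions of (feature mean - vulnerability mean)."""
    diffs = [f - v for f, v in zip(features, vulns)]
    return sum(diffs) / len(diffs)


def final_score(unscaled, n_features, n_vulns):
    """Multiply by the feature/vulnerability ratio, or divide if negative."""
    k = n_features / n_vulns
    return unscaled / k if unscaled < 0 else unscaled * k


results = {}
for name, data in SYSTEMS.items():
    u = unscaled_score(data["features"], data["vulns"])
    results[name] = (u, final_score(u, *data["counts"]))
    print(f"{name}: final {results[name][1]:.1f}")
```

Rounded to one decimal place, the recomputed final scores agree with the published -1.0, -3.5 and 10.2. The recomputed unscaled scores for Solaris and OpenBSD differ from the published -1.80 and 2.80 by a few hundredths, presumably because the table entries are themselves rounded.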



4.2 Relationships between Development Approach and Security 

Research into the development approach of each of the three systems drew from 
a variety of sources, ranging from available documentation, to interviews with 
developers, to close examination and testing of the systems themselves. Despite 
the large quantity of information available, researching the details of the deve- 
lopment of the software often proved a difficult task. For “open source” or free 
software projects, very few restrictions (if any) are placed on access to develop- 
ment information. However, this does not necessarily mean that development 
decisions are fully documented. For example, even though the source code is 
freely available for inspection, this does not necessarily provide information such 
as the rationale behind certain design or implementation approaches taken. On 
the other hand, proprietary software “projects” will often be reticent to make 
public what they may consider to be sensitive development details. A further 
problem unique to the study of system security is that developers, whether from 
a proprietary or free software project, often appear to view the opportunity to 
talk about security as something of a public relations exercise. While all develo- 
pers are willing to openly state how important they consider security, often they 
will be less willing to give further details and therefore all claims made by inter- 
viewees and in documentation had to be considered with a degree of scepticism 
and in the light of other available evidence. 

The Role of the Phases. Despite the limitations in obtaining details of de- 
velopment practices, very definite trends still emerged in relating development 
methodology to security. Each of the phases appeared to play a particular role 
in determining the security of the final system. While the analysis phase is not 
directly connected with any specific aspect of security, establishing security as 
an important requirement exerts the overall effect of increasing its consideration 
during subsequent phases. For example, security is one of the primary objectives 
of the OpenBSD project whereas Solaris and Debian’s objectives were instead 
centred around profitability, scalability and performance for the former, and 
promotion of free software for the latter. The link between the design phase of
development and security is obvious and is closely connected with both the qua- 
lity and number of features included. For example, OpenBSD had 18 features 
with an average score of 7.03 out of 10, compared with 15 and 6.42 for Debian 
and 11 and 5.92 for Solaris (Tables 1 and 2). The role of the implementation 
phase exhibits itself to some extent by considering the vulnerabilities observed. 
However, despite the fact that the majority of these vulnerabilities were coding 
errors, the relationship was not as strong as might have been expected. While 
the Solaris developer interviewees referred to special education programmes on 
secure programming that were being conducted at Sun, Solaris exhibited by far 
the highest number of vulnerabilities (21) with these also being quite severe in 
nature (rated at 7.74 out of 10). At the same time, OpenBSD developers inter- 
viewed suggested that their project did not have a clearly defined coding policy 
and yet the security analysis of OpenBSD only found 5 security flaws of fairly 
minor severity (rated 4.19 out of 10). Instead OpenBSD developers focused on 
cultivating a better understanding of the behaviour of the various system calls 
and library routines being used with the aim of reducing errors overall. However, 
it appears that the key application of this does not occur at the implementation 
phase but rather during maintenance. OpenBSD has a comprehensive and conti- 
nuous code audit for the purpose of finding and fixing potential bugs, regardless 
of whether or not these are necessarily security problems. It therefore appears 
that this process, if conducted as part of routine code maintenance, has a highly 
significant effect on the security of the system. 



4.3 Leveraging Development to Improve Security 

Although not strictly an objective of this study, there are some conclusions that 
can be clearly drawn regarding ways that the development process can be direc- 
ted to improve the security of the final system. In many ways, the development 
approach of the OpenBSD project provides something of a template for tackling 
such a problem. Firstly, during the analysis phase, security must be established 
as a requirement of the final system. This may involve consideration of the po- 
tential environments where the system will likely be deployed and some analysis 
of the threats that it will face so that the level of security required may also be 
determined. Establishing security as a key objective promotes its consideration 
throughout the rest of the development process. Security features developed du- 
ring the design phase need to be integrated into the system from an early stage, 
in harmony with the security requirements determined during analysis; “tacking” 
security on at the end of development limits its effectiveness. Implementation
needs to be conducted in a manner which is aware of secure programming practices;
however, perhaps most important is the level of priority given to security
during code maintenance. A focused, intensive and on-going code audit, while
potentially requiring a considerable investment of time, appears to produce hig- 
hly effective results. Whether or not companies and volunteer projects will be 
sufficiently interested in implementing such a process is, however, not certain 
given the effort involved and the fact that users of the system will often not see 
tangible evidence of it. 
Source Code Availability and Security. At present it is a matter of great
debate as to whether free or “open source” software provides better security 
than proprietary software. Although exploring this issue was not an objective of 
this study, the results do allow a conclusion to be drawn. Firstly, examining the 
arguments of both sides reveals something interesting: they are both arguing the 
same thing. “Open source” advocates suggest that the availability of source code 
means “more eyeballs” ensuring security bugs are found and fixed more quickly. 
In addition, not only are free software developers usually quick to release security 
patches but, since the user has access to the source code to begin with, they may 
fix the problem themselves if need be. On the other hand, proprietary vendors 
argue that allowing anyone access to the source makes it far easier for people to 
find holes to exploit and, by hiding the source code, some protection is gained 
from attack. In other words both parties argue that, if you hide the source code, 
people are less likely to find vulnerabilities. However, is this really the case? And 
which position is correct in terms of increasing security? 

Examining the results from the security analysis provides one possible an- 
swer to these questions. Firstly, comparing the final scores for Debian and Solaris 
(a free software and proprietary system respectively) shows Debian marginally 
ahead although, given the estimated precision of the metric, the advantage is 
slight at best. Therefore a possible conclusion is that source code availability does 
help security although the advantages and disadvantages tend to even out
somewhat. However, examining the results more closely reveals something further:
while Debian recorded 12 vulnerabilities, Solaris had 21. This strongly suggests
that hiding the source code does not necessarily bring about the degree of
protection from vulnerability discovery that “closed source” advocates suggest.

However, this conclusion is based only on comparing Debian and Solaris,
completely neglecting the results of the OpenBSD analysis. Indeed, OpenBSD’s
high score suggests that either it or Debian is something of an anomaly. It
appears that OpenBSD, being an “open source” system, automatically has all the
advantages of such while, at the same time, being only minimally affected by
any increased potential for security flaws to be found. The author believes these
results show that source code availability by itself does not significantly impact
security. The fact that many people can read the code themselves to verify its
security does not necessarily mean that many are doing so⁹. Therefore, it is
OpenBSD’s deliberate auditing process which is responsible for it having so few
vulnerabilities and not the availability of its source code. Interestingly, an
interview respondent from the OpenBSD project agreed with this, suggesting that,
even if the project became proprietary, with the hiring of several key
contributors the operating system’s low number of vulnerabilities would likely
continue. This brings hope to proprietary software vendors since it suggests that,
should they invest sufficient resources in an internal code auditing programme,
they need never, for security reasons, consider releasing their source code.

⁹ Nor that they have the necessary security expertise to even identify problems
therein.


5 Conclusion

This study attempted to explore relationships between the way security is con- 
sidered throughout the development process and the security provided by the 
fielded system. It took three modern Unix-based operating systems and exami- 
ned information concerning their development to attempt to discern the role that 
security had played. To make this task easier, the “waterfall” or “life-cycle” me- 
thodology was used as a way of modelling the development process by structuring 
it according to four development phases: analysis, design, implementation and 
maintenance. To assess the “real-world” security provided by the three systems, 
a metric was developed which attempts to quantify the often elusive attribute of 
security by comparing the nature and number of security features provided by 
the system with the nature and number of vulnerabilities affecting it. The results 
from this metric allowed for comparisons to be made between effective security 
and the consideration of security during development. The scores from the me- 
tric and the study of the development approaches showed that the OpenBSD 
system, which rates security as one of its highest priorities and considers secu- 
rity issues throughout every stage of development, was by far the most secure 
system studied. These results thus supported the hypothesis that an approach 
which thoroughly considers security issues during every development phase will 
produce a system that is observably more secure than one which does not. They also
showed the specific role that each development phase played in the security of 
the system and, in particular, the effectiveness of security code auditing. These 
results can now be used by developers to leverage the development process in 
order to increase the degree of security achieved in the final system. 
Abstract. Security of ordinary digital signature schemes relies on a
computational assumption. Fail-Stop Signature (FSS) schemes provide 
security for a sender against a forger with unlimited computational po- 
wer by enabling the sender to provide a proof of forgery, if it occurs. In 
this paper, first we propose a new FSS scheme whose security is based 
on discrete logarithm modulo a composite number, and integer facto- 
rization. We provide a security proof of the scheme, and show that it 
is as efficient as the most efficient previously known FSS scheme. Next,
we construct a Threshold FSS that requires collaboration of t out of n 
participants to generate a signature and to prove forgery if it occurs. The 
scheme is equipped with cheater detection (incorrect partial signature) 
which is essential for an effective proof of forgery in Threshold FSS and 
only requires trusted authority during pre-key generation. 



1 Introduction 

Ordinary digital signatures, introduced in the seminal paper of Diffie and
Hellman [7], allow a signer with a secret key to sign messages such that anyone
with access to the corresponding public key is able to verify the authenticity of
the message. Security of an ordinary digital signature relies on a computational
assumption, namely that there is no efficient algorithm to solve the underlying
hard problem. This means that the security holds only in a computational sense,
and an enemy with unlimited computational power can always forge the signature
by solving the underlying hard problem. In an ordinary digital signature
there is no mechanism to protect the signer against possible forgeries; that is, if
a signed message passes the verification test then it is assumed to have been
generated by the owner of the secret key.

To protect against this attack, Fail-Stop Signature (FSS) schemes have been
proposed [22,18,14]. In an FSS, in the case of forgery, the presumed signer is able
to prove that a forgery has happened. This is done by showing that the underlying
hard problem is solved, after which the system is stopped, hence the name
fail-stop. In this way, a polynomially bounded signer is protected against a forger
with unlimited computational power. In its basic form an FSS scheme is a
one-time digital signature that can be used for signing a single message. However,
there are ways of extending the scheme to work for multiple messages [4,20,16,1].
In an FSS the sender and the receivers are all polynomially bounded, whereas the
enemy has unlimited computational power [20,21,17].

* This work is in part supported by Australian Research Council Grant Number

To measure efficiency of an FSS scheme, a number of criteria, including the 
lengths of the signature, the secret key and the public key, together with the 
amount of computation and communication required for signature generation 
and verification, are used. 



1.1 Previous Works 

The first construction of fail-stop signature [22] uses a one-time signature scheme 
(similar to [12]) and results in bit by bit signing of the message, resulting in an 
impractical system. 

In [15] an efficient single-recipient FSS to protect clients in an on-line payment
system is proposed. The main disadvantage of this system is that signature
generation is a 3-round protocol between the signer and the recipient, which
makes it expensive from a communication point of view. The size of the signature
is twice the length of the message.

van Heijst and Pedersen [20] proposed an efficient FSS that uses the difficulty
of the discrete logarithm problem as the underlying assumption. In the case of a
forgery, the presumed signer can solve an instance of the discrete logarithm
problem, and prove that the underlying assumption is broken. This is the most
efficient scheme known so far and will be referred to as the vHP scheme.
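To make the proof-of-forgery idea concrete, the toy sketch below follows the usual textbook formulation of a discrete-logarithm-based fail-stop signature of the vHP type. It is an illustration only and is not taken from this paper: the parameter values are hypothetical and far too small for real use, the function names are invented, and the "forger" is simulated by simply handing it the discrete logarithm that an unbounded adversary could compute.

```python
# Toy parameters (hypothetical, insecure): G has prime order Q in Z_P^*.
P = 23            # prime modulus
Q = 11            # prime order of the subgroup; Q divides P - 1
G = 2             # element of order Q modulo P
A = 7             # log_G(H): chosen by the centre, unknown to the signer
H = pow(G, A, P)


def keygen(x1, x2, y1, y2):
    """Secret key (x1, x2, y1, y2); public key (G^x1 H^x2, G^y1 H^y2) mod P."""
    secret = (x1 % Q, x2 % Q, y1 % Q, y2 % Q)
    public = (pow(G, x1, P) * pow(H, x2, P) % P,
              pow(G, y1, P) * pow(H, y2, P) % P)
    return secret, public


def sign(secret, m):
    """One-time signature on a message m in Z_Q."""
    x1, x2, y1, y2 = secret
    return ((x1 + m * y1) % Q, (x2 + m * y2) % Q)


def verify(public, m, sig):
    a1, a2 = public
    s1, s2 = sig
    return a1 * pow(a2, m, P) % P == pow(G, s1, P) * pow(H, s2, P) % P


def prove_forgery(own_sig, forged_sig):
    """Recover log_G(H) from two distinct valid signatures on one message."""
    s1, s2 = own_sig
    f1, f2 = forged_sig
    if (f2 - s2) % Q == 0:
        raise ValueError("signatures do not yield a proof of forgery")
    return (s1 - f1) * pow((f2 - s2) % Q, -1, Q) % Q


secret, public = keygen(3, 5, 4, 9)
m = 6
sig = sign(secret, m)

# An unbounded forger knows A and can pick another pair (f1, f2) that satisfies
# the verification equation; it will almost never match the signer's own pair.
forged = ((sig[0] - A) % Q, (sig[1] + 1) % Q)

recovered = prove_forgery(sig, forged)
```

Both `sig` and `forged` pass `verify`, yet from the two of them the signer computes `recovered`, which equals the discrete logarithm of H. A polynomially bounded signer could not have found this value unaided, so exhibiting it serves as evidence that the assumption was broken.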

In [14,17], a formal definition of FSS schemes is given and a general
construction using bundling homomorphisms is proposed. The important property of
this construction is that it is provably secure against the most stringent type of
attack, that is, adaptive chosen message attack [10]. The proof of forgery is by
showing two different signatures on the same message, the forged one and the
one generated by the valid signer. To verify the proof of forgery, the two
signatures are shown to collide under the “bundling homomorphism”. The vHP scheme
is an example of this construction. van Heijst, Pedersen and Pfitzmann [21] also
gave an example of this construction that uses the difficulty of factoring as the
underlying computational assumption of the system.

In [19], a variation of the van Heijst scheme is proposed. This scheme uses the
cyclic group generated by a point on an elliptic curve and results in a shorter
signature and secret key for the same level of security.



1.2 Threshold FSS 

In many applications group responsibility or commitment is required. For
example, passing a bill in a parliament requires the consent of a certain majority.
Threshold signature schemes [6,3,5] require collaboration of t out of n signers
for a valid signature to be produced, while collusion of up to t - 1 signers cannot
generate a signature. In [19] a threshold FSS in which generation of a signature
requires collaboration of t out of n signers is proposed. The scheme requires a
trusted authority to distribute the key information and lacks an effective method
of proving forgery. That is, although the authors state that providing a proof of
forgery requires t signers to collaborate, there is no clear procedure to allow a
group of t' < t signers who want to prove a forgery to find another t - t' honest
participants. This shortcoming makes proof of forgery completely ineffective.

1.3 Our Contributions 

Firstly, we propose a new FSS scheme whose security is based on two
assumptions: the difficulty of the discrete logarithm problem modulo a composite
number and the difficulty of factoring integers. The scheme is inspired by an
identification scheme proposed by Girault [9] and is an instance of the general
construction, and hence has proven security. We compare the efficiency of the
scheme with that of the vHP scheme, and show that the two schemes have equal
performance. Secondly, we employ an idea from [11] to construct a threshold FSS
that does not require a trusted authority for key generation. The scheme provides
cheater detection and enables the combiner to detect and remove a sender who
sends junk instead of his partial signature. This is the essential property that is
used to detect dishonest participants, and so allows a group of $t'$, $t' < t$,
senders to form a group of size $t$ and provide a proof of forgery.



2 Preliminaries 

In this section, we briefly recall relevant notions, definitions and requirements of 
FSS and refer the reader to [18,17,14] for a more complete account. 

2.1 Notations 

The length of a number $n$ is the length of its binary representation and is
denoted by $|n|_2$.

The ring of integers modulo a number $n$ is denoted by $Z_n$, and its
multiplicative group, which contains only the integers relatively prime to $n$,
by $Z_n^*$.

2.2 Fail-Stop Signature Schemes

Similar to an ordinary digital signature scheme, a fail-stop signature scheme
consists of one polynomial time protocol and two polynomial time algorithms.

1. Key generation: a two-party protocol between the signer and the center
to generate a pair consisting of a secret key, $sk$, and a public key, $pk$. This
is different from ordinary signature schemes, where key generation is performed
by the signer individually and without the involvement of the receiver.

2. Sign: the algorithm used for signature generation. For a message $m$ and
using the secret key $sk$, the signature is given by $y = sign(sk, m)$.

3. Test: the algorithm for testing acceptability of a signature. For a message
$m$ and signature $y$, and given the public key $pk$, the algorithm produces an ok
response if the signature is acceptable under $pk$, that is, if
$test(pk, m, y) = ok$.

An FSS also includes two more polynomial time algorithms:

4. Proof: an algorithm for proving a forgery;

5. Proof-test: an algorithm for verifying that the proof of forgery is valid.

A secure FSS scheme must satisfy the following properties [21,17,14].

1. If the signer signs a message, the recipient must be able to verify the
signature (correctness).

2. A polynomially bounded forger cannot create forged signatures that
successfully pass the verification test (recipient's security).

3. When a forger with unlimited computational power succeeds in forging a
signature that passes the verification test, the presumed signer can construct
a proof of forgery and convince a third party that a forgery has occurred
(signer's security).

4. A polynomially bounded signer cannot create a signature that he can later
prove to be a forgery (non-repudiability).

To achieve the above properties, for each public key there exist many matching
secret keys, such that different secret keys create different signatures on the
same message. The real signer knows only one of the secret keys, and can construct
only one of the many possible signatures. An enemy with unlimited computing power
can generate all the signatures, but cannot determine which one is generated by
the true signer. Thus, the signer can provide a proof of forgery by generating a
second signature on the message with the forged signature, and using the two
signatures to show that the underlying computational assumption of the system is
broken, hence proving the forgery.

Security of an FSS can be broken if 1) a signer can construct a signature
that he can later prove to be a forgery, or 2) an unbounded forger succeeds in
constructing a signature that the signer cannot prove to be forged. These two
types of forgery are completely independent, and so two different security
parameters, $k$ and $\sigma$, are used to express the level of security against
the two types of attack. More specifically, $k$ is the security level of the
recipient and $\sigma$ is that of the signer. It is proved [14] that a secure FSS
is secure against adaptive chosen message attack and that, for all $c > 0$ and
large enough $k$, the success probability of a polynomially bounded forger is
bounded by $k^{-c}$. For an FSS with security level $\sigma$ for the signer, the
success probability of an unbounded forger is limited by $2^{-\sigma}$.

A general construction for FSS is given in [14]. The construction is for a
single message and uses bundling homomorphisms. Bundling homomorphisms
can be seen as a special kind of hash function.

Definition 1. [14] A bundling homomorphism $h$ is a homomorphism $h : G \to H$
between two Abelian groups $(G, +, 0)$ and $(H, \times, 1)$ that satisfies the
following.

1. Every image $h(x)$ has at least $2^{\tau}$ preimages. $2^{\tau}$ is called the
bundling degree of the homomorphism.

2. It is infeasible to find collisions, i.e., two different elements that are mapped
to the same value by $h$.

Theorem 4.1 [14] proves that for any family of bundling homomorphisms and
any choice of parameters the general construction satisfies the following:

1. it produces correct signatures;

2. a polynomially bounded signer cannot construct a valid signature and a proof
of forgery;

3. if an acceptable signature $s^* \neq sign(sk, m^*)$ is found, the signer can
construct a proof of forgery.

Moreover, for two chosen parameters $k$ and $\sigma$, a good prekey $K$ and two
messages $m, m^* \in M$, with $m \neq m^*$, let

$$T := \{d \in G \mid h(d) = 1 \wedge (m^* - m)d = 0\} \qquad (1)$$

Theorem 4.2 [14] shows that given $s = sign(sk, m)$ and a forged signature
$s^* \in G$ such that $test(pk, m^*, s^*) = ok$, the probability that
$s^* = sign(sk, m^*)$ is at most $|T|/2^{\tau}$, and so the best chance of success
for an unrestricted forger to construct an undetectable forgery is bounded by
$|T|/2^{\tau}$. Thus, to provide the required level of security $\sigma$, we must
choose $|T|/2^{\tau} \leq 2^{-\sigma}$.

This general construction is the basis of all known provably secure construc- 
tions of FSS. It provides a powerful framework by which proving security of a 
scheme is reduced to specifying the underlying homomorphism, and determining 
the bundling degree and the set T. 

3 The Proposed Scheme 

Model 

There is a center, TA, who is trusted by the recipients (and not necessarily by
the sender), who sets up the system but is not involved in signature generation
or verification. There is a polynomially bounded sender, $S$, who has a secret
key. A polynomially bounded recipient can verify a signature by using $S$'s public
key. In the case of dispute, the presumed sender can prove that the underlying
instance of the discrete logarithm problem modulo a composite number has been
broken. The TA can easily be eliminated by replacing its role in the prekey
generation phase with a coin-flipping protocol, together with the shared
generation of the modulus $n$ (using a method similar to [2,8]).

Prekey Generation

Given the two security parameters $k$ and $\sigma$, TA chooses two prime numbers
$p$ and $q$, where $p = 2fp' + 1$ and $q = 2fq' + 1$ and $f, p', q'$ are also
prime. The length $|f|_2$ must be chosen such that the subgroup discrete logarithm
problem for the multiplicative subgroup of order $f$ in $Z_n^*$ is intractable
(for example, by choosing $|n|_2 \approx 1881$ bits and letting
$|f|_2 \approx 151$ bits [13]). TA then obtains $n = pq$ and chooses an element
$\alpha \in Z_n^*$ of order $f$ both modulo $p$ and modulo $q$. Therefore, the
order of $\alpha$ modulo $n$ is also $f$. Let $Q_f$ denote the subgroup of $Z_n^*$
generated by $\alpha$. TA also chooses a secret random number $a \in Z_f$ and
computes $\beta = \alpha^a \pmod n$. Finally, he publishes $(n, f, \alpha, \beta)$
and keeps $(p, q, a)$ secret.

Prekey Verification 

Prekey verification is performed by the sender $S$, who checks that

$$\alpha^f \stackrel{?}{=} 1 \pmod n$$

A prekey is good if the above equation holds.
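
The prekey steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `prekey_gen`, `prekey_is_good`, `alpha` and `beta` are hypothetical, the primes are toy-sized (f = 11, p = 67, q = 419), primality is checked by trial division, and the secret exponent is drawn from [1, f-1] so that the public power is not trivially 1.

```python
import secrets


def is_prime(x: int) -> bool:
    """Trial division; adequate only for the toy sizes used here."""
    if x < 2:
        return False
    d = 2
    while d * d <= x:
        if x % d == 0:
            return False
        d += 1
    return True


def element_of_order(f: int, prime: int) -> int:
    """Return an element of order f modulo `prime`, assuming f is prime and f | prime - 1."""
    for h in range(2, prime):
        g = pow(h, (prime - 1) // f, prime)
        if g != 1:  # g^f = 1 and g != 1, so ord(g) = f because f is prime
            return g
    raise ValueError("no element of order f found")


def prekey_gen(f: int, p1: int, q1: int):
    """TA side: build p = 2*f*p1 + 1, q = 2*f*q1 + 1, n = p*q and the pair (alpha, beta)."""
    p, q = 2 * f * p1 + 1, 2 * f * q1 + 1
    if not all(is_prime(v) for v in (f, p1, q1, p, q)):
        raise ValueError("f, p', q', p and q must all be prime")
    n = p * q
    alpha_p, alpha_q = element_of_order(f, p), element_of_order(f, q)
    # Chinese remaindering: alpha = alpha_p (mod p) and alpha = alpha_q (mod q),
    # so alpha has order f modulo p, modulo q, and therefore modulo n.
    alpha = (alpha_p * q * pow(q, -1, p) + alpha_q * p * pow(p, -1, q)) % n
    a = secrets.randbelow(f - 1) + 1  # secret exponent in [1, f-1] (assumption)
    beta = pow(alpha, a, n)
    return (n, f, alpha, beta), (p, q, a)


def prekey_is_good(n: int, f: int, alpha: int, beta: int) -> bool:
    """Sender side: the check alpha^f = 1 (mod n) from the prekey verification step."""
    return pow(alpha, f, n) == 1


public, secret = prekey_gen(11, 3, 19)  # toy values giving p = 67, q = 419
```

Modular inverses use the three-argument `pow` with exponent -1, which requires Python 3.8 or later.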

Key Generation 

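$S$ chooses $a_1, a_2, b_1, b_2 \in Z_f$ as his secret key, computes

$$\gamma_1 = \alpha^{a_1}\beta^{a_2} \pmod n \quad\text{and}\quad \gamma_2 = \alpha^{b_1}\beta^{b_2} \pmod n$$

and publishes $(\gamma_1, \gamma_2)$ as his public key.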

Signing a Message 

To sign a message $m \in Z_f$, $S$ computes

$$y_1 = a_1 + b_1 m \pmod f \quad\text{and}\quad y_2 = a_2 + b_2 m \pmod f$$

and publishes $(y_1, y_2)$ as his signature on $m$.

Testing a Signature 

$(y_1, y_2)$ passes the verification test if the following equation holds:

$$\gamma_1\gamma_2^{m} = \alpha^{y_1}\beta^{y_2} \pmod n$$



Proof of Forgery 

If there is another signature $(y_1', y_2')$ on message $m$ which also passes the
verification test, then the presumed signer can produce his own signature
$(y_1, y_2)$ on $m$ and follow the steps below,

$$\alpha^{y_1}\beta^{y_2} = \alpha^{y_1'}\beta^{y_2'} \pmod n$$
$$\alpha^{(y_1 - y_1')} = \beta^{(y_2' - y_2)} \pmod n$$
$$\alpha^{(y_1 - y_1')} = \alpha^{a(y_2' - y_2)} \pmod n$$
$$(y_1 - y_1') = a(y_2' - y_2) \pmod f$$
$$a = (y_1 - y_1')(y_2' - y_2)^{-1} \pmod f$$

to obtain $a$ as the proof of forgery. In effect, the presumed signer has shown
that he can solve an instance of the discrete logarithm problem which has been
assumed to be hard.
3.1 Security Proof

The proposed scheme follows the general construction of [14].

Bundling Homomorphism

- Families of groups: Let $n = pq$. Define $G_K = Z_f \times Z_f$ and $H_K = Z_n^*$.

- The homomorphism: $h_{(p,q,f,\alpha,\beta)}$ is defined as

$$h_{(p,q,f,\alpha,\beta)} : Z_f \times Z_f \to Z_n^*, \qquad h_{(p,q,f,\alpha,\beta)}(a_1, a_2) = \alpha^{a_1}\beta^{a_2} \pmod n, \quad a_1, a_2 \in Z_f$$

Since our construction follows the general construction in [14], it is
required to prove the following theorems [14].

Theorem 1. Under the Discrete Logarithm and Factorization Assumptions, the
above construction is a family of bundling homomorphisms.

Proof. To show that the above definition is a bundling homomorphism, we need
to show (Definition 4.1 [14]) that

1. For any value $\mu \in Z_n^*$ which is an image of $Z_f \times Z_f$, there are
$f$ preimages in $Z_f \times Z_f$. Moreover, given such a $\mu$, it is hard to find
a pair $(c_1, c_2) \in Z_f \times Z_f$ such that
$\mu = \alpha^{c_1}\beta^{c_2} \pmod n$.

2. It is hard to find two pairs $(y_1, y_2), (y_1', y_2') \in Z_f \times Z_f$ that
map to the same $\mu \in Z_n^*$.

To prove 1, we note that if $\mu = \alpha^{c_1}\beta^{c_2} = \alpha^{c} \pmod n$,
where $\beta = \alpha^a \pmod n$ and $ord_n(\alpha) = f$, then there exist exactly
$f$ different pairs $(c_1, c_2)$ in $Z_f \times Z_f$ satisfying
$c = c_1 + ac_2 \pmod f$. Hence, there are $f$ preimages for $\mu$ in
$Z_f \times Z_f$. We also note that given $\mu = \alpha^{c_1 + ac_2} \pmod n$,
since $ord_n(\alpha) = f$, finding $(c_1 + ac_2)$ is equivalent to solving the
discrete logarithm problem in a subgroup of size $f$ in $Z_n^*$, which is assumed
to be difficult [13].
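
The counting argument for property 1 can be checked exhaustively on a toy instance. The sketch below is only an illustration: the parameters (f = 11, p = 67, q = 419), the stand-in secret exponent 7 and the helper names are assumptions, and the map `h` sends a pair $(c_1, c_2)$ to the product of the two public powers modulo $n$. It tallies how many pairs land on each image.

```python
from collections import Counter

F, P, Q = 11, 67, 419  # toy prekey (illustrative only): P = 2*F*3 + 1, Q = 2*F*19 + 1
N = P * Q


def order_f_element(prime: int) -> int:
    """First element of order F modulo `prime`, found as h^((prime-1)/F) != 1."""
    return next(g for g in (pow(h, (prime - 1) // F, prime) for h in range(2, prime)) if g != 1)


ALPHA = (order_f_element(P) * Q * pow(Q, -1, P) + order_f_element(Q) * P * pow(P, -1, Q)) % N
A_SECRET = 7  # arbitrary stand-in for the TA's secret exponent
BETA = pow(ALPHA, A_SECRET, N)


def h(c1: int, c2: int) -> int:
    """The bundling homomorphism: (c1, c2) -> alpha^c1 * beta^c2 mod n."""
    return pow(ALPHA, c1, N) * pow(BETA, c2, N) % N


# Number of preimages of every value that is hit by h.
preimage_counts = Counter(h(c1, c2) for c1 in range(F) for c2 in range(F))
```

With these toy values every image should have exactly `F` preimages, since each pair is mapped according to $c_1 + ac_2 \bmod f$ alone.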

It is easy to see that if the signer could solve the discrete logarithm problem
modulo a composite number, then he could derive $a$ from $\alpha$ and $\beta$ and
find another secret key that matches his public key. On the other hand, if the
signer could solve the factorization problem, then he could obtain

$$\beta_p = \alpha_p^{a_p} \pmod p \quad\text{and}\quad \beta_q = \alpha_q^{a_q} \pmod q$$

where for $X = \alpha$ and $\beta$, $X_c$ denotes $X \pmod c$. Since the sizes
$|p|_2$ and $|q|_2$ are not chosen such that the discrete logarithm problem in
these multiplicative groups is hard, the signer could find $a_p$ and $a_q$ and, by
using the Chinese Remainder Theorem, obtain $a$. This allows him to find another
secret key, and construct a signature that he can deny at a later stage.
Therefore, the system is secure only if both the discrete logarithm problem
modulo a composite number and the factorization problem are hard.

Property 2 means that it is difficult to find $(y_1, y_2)$ and $(y_1', y_2')$ such
that $\alpha^{y_1}\beta^{y_2} = \alpha^{y_1'}\beta^{y_2'} \pmod n$. Suppose that
there is a probabilistic polynomial-time algorithm $A$ that could compute such a
collision. Then, we can construct an algorithm $D$ that on input
$(n, f, \alpha, \beta)$, where $\beta = \alpha^a \pmod n$ and $ord_n(\alpha) = f$,
outputs $a$ as follows.

First, $D$ runs $A$, and if $A$ outputs a collision, i.e.
$\alpha^{y_1}\beta^{y_2} = \alpha^{y_1'}\beta^{y_2'} \pmod n$ for
$(y_1, y_2) \neq (y_1', y_2')$, then $D$ computes:

$$\alpha^{y_1}\beta^{y_2} = \alpha^{y_1'}\beta^{y_2'} \pmod n$$
$$(y_1 - y_1') = a(y_2' - y_2) \pmod f$$
$$a = (y_1 - y_1')(y_2' - y_2)^{-1} \pmod f$$

$D$ is successful with the same probability as $A$ and almost equally efficient.
Hence, it contradicts the discrete logarithm assumption. □

To find the security level of the signer we follow Theorem 4.2 of [14] and find
the size of the set $T$:

$$T := \{(c_1, c_2) \in Z_f \times Z_f \mid \alpha^{c_1 + ac_2} = 1 \pmod n \wedge m'(c_1 + ac_2) = 0\}$$

for all values of $m'$ between 1 and $f - 1$, given that the prekey is good. Since
$(0, 0)$ is the only element of this set, the size of the set $T$ is 1. □

Theorem 2. In the proposed FSS scheme $\sigma = \tau$.

4 Efficiency Comparison

The aim of this section is to compare efficiency of the proposed scheme with 
the best known FSS schemes. Efficiency of a fail-stop signature system has been 
measured in terms of three length parameters: the lengths of the secret key, the 
public key and the signature, and the amount of computation required in each 
case. 

To compare two FSSs we fix the level of security provided by the two schemes 
and find the size of the three length parameters, and the number of operations 
(for example multiplication) required for signing and testing. 

Table 1 gives the results of comparison of three FSS schemes when the security
levels of the receiver and the sender are given by $k$ and $\sigma$, respectively.
In this comparison, the first two schemes (first and second columns of the table)
are chosen because they have provable security. The first scheme, referred to as
vHP in this paper, is the most efficient provably secure scheme. The second scheme
is a factorization-based FSS proposed in [21,14]. Column three corresponds to
our proposed scheme.

We use the same values of $\sigma$ and $k$ for all the systems and determine the
size of the three length parameters. The underlying hard problems in all the
schemes are the Discrete Logarithm (DL) problem, Subgroup DL [13] and/or the
Factorization problem. This means the same level of receiver's security (given by
the value of parameter $k$) translates into different sizes of primes and moduli.
In particular, the security level of a 151-bit subgroup discrete logarithm with
basic primes of at least 1881 bits is the same as that of factoring an 1881-bit
RSA modulus [13].

To find the required size of primes in the vHP scheme, assuming security
parameters $(k, \sigma)$ are given, first $K = \max(k, \sigma)$ is found and then
the prime $q$ is chosen such that $|q|_2 \geq K$. The bundling degree in this
scheme is $q$, and the value of $p$ is chosen such that $q \mid p - 1$ and
$(p - 1)/q$ is upper-bounded by a polynomial in $K$ (pages 237 and 238 of [17]).
The size $|p|_2$ must be chosen according to the standard discrete logarithm
problem, which for adequate security must be at least 1881 bits [13]. However, the
size $|q|_2$ can be chosen as low as 151 bits [13]. Since $|p|_2$ and $|q|_2$ are
to some extent independent, we use $\hat{K}$ to denote $|p|_2$.

In the factorization scheme of [14], the security level of the sender, $\sigma$,
satisfies $\tau = \rho + \sigma$, where $\tau$ is the bundling degree and
$2^{\rho}$ is the size of the message space. The security parameter of the
receiver, $k$, is determined by the difficulty of factoring the modulus $n$. Now
for a given pair of security parameters, $(k, \sigma)$, the size of the modulus,
$N_k$, is determined by $k$, but determining $\tau$ requires knowledge of the size
of the message space. Assume $\rho = |p|_2 \approx |q|_2 = N_k/2$. This means
that $\tau = \sigma + N_k/2$. Now the efficiency parameters of the system can be
given as shown in the table. In particular, the sizes of the secret and public
keys are $2(\tau + N_k)$ and $2N_k$, respectively.

In our proposed scheme, the bundling degree and hence the security level
of the sender is $\sigma = \tau = |f|_2$. The security of the receiver is
determined by the difficulty of factoring $n$ and of the discrete logarithm in a
subgroup of size $f$ in $Z_n^*$. Assume that $|p|_2 \approx |q|_2$ and
$|n|_2 \approx c \times |f|_2$. Then, given the security parameters
$(k, \sigma)$, we must first find $N_k$, the modulus size for which the hardness
of factorization is $k$. Next, we find $F_{k,N_k}$, the minimum size of a
multiplicative subgroup of $Z_n^*$ for which the subgroup discrete logarithm has
hardness $k$ [13]. Finally, we choose $K = \max(F_{k,N_k}, \sigma)$ and set
$|f|_2 = K$. With these choices, the sender's and receiver's levels of security
are at least $\sigma$ and $k$, respectively. We use $\hat{K}$ to denote $|n|_2$.

Table 1. Comparison of efficiency parameters

                                DL [20]    Fact [14]         Our FSS
PK (mult)                       4K         2K                4K
Sign (mult)                     2          K                 2
Test (mult)                     3K         2K + $\sigma$     3K
Length of SK (bits)             4K         4K + 2$\sigma$    4K
Length of PK (bits)             2K         2K                2K
Length of a signature (bits)    2K         2K + $\sigma$     2K
Underlying hard problem         DL         Fact              DL & Fact

We note that the efficiency parameters in our proposed scheme are the same
as those of the vHP scheme. In the vHP scheme, to achieve adequate security, $K$
must be chosen to be at least 151 bits, and $\hat{K}$ must be at least 1881 bits
[13]. These are also the values required by our scheme. Our scheme outperforms
the factorization scheme of [14].
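
To make the quoted sizes concrete, the short Python sketch below turns the 151-bit and 1881-bit figures into key and signature lengths for the proposed scheme. It relies only on the shapes given in Section 3 (four secret exponents and two signature components modulo the 151-bit subgroup order, two public-key components modulo the 1881-bit modulus). The function name `fss_sizes` and the assumption that each public-key component is stored as a full residue modulo $n$ are illustrative choices.

```python
def fss_sizes(K: int, K_hat: int) -> dict:
    """Bit lengths implied by the scheme's key and signature shapes.

    K     : bit length of the subgroup order f
    K_hat : bit length of the modulus n
    """
    return {
        "secret_key_bits": 4 * K,      # (a1, a2, b1, b2), each in Z_f
        "signature_bits": 2 * K,       # (y1, y2), each in Z_f
        "public_key_bits": 2 * K_hat,  # (gamma1, gamma2), each a residue modulo n
    }


sizes = fss_sizes(K=151, K_hat=1881)
print(sizes)
```

Under these assumptions the secret key is 604 bits, a signature is 302 bits, and the public key is 3762 bits.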

5 Threshold FSS Scheme 

In a threshold FSS, signature generation requires collaboration of $t$ out of $n$
signers. The fail-stop property must hold for each partial signature, as otherwise
an unbounded enemy can corrupt one of the signers and create a signature that
cannot be proved to be a forgery. In the following we propose a threshold FSS that
has an added property: cheater detection. This means that the combiner can verify
the correctness of partial signatures, and so a malicious signer cannot disrupt
signature generation by submitting junk instead of valid partial signatures. This
is also an essential property in providing an effective method of proving forgery.

Model 

There is a group of $N$ polynomially bounded senders, $Q = \{S_1, S_2, \ldots, S_N\}$,
a polynomially bounded combiner $C$, who is only trusted to combine partial
signatures, and a trusted authority TA who is only active in prekey generation.
A group member $S_i$ is assigned an element $x_i \in GF(f)$.

Adversary 

The unlimited enemy $\mathcal{E}$ can corrupt up to $t - 1$ group members. We
assume there exist at least $t$ honest participants.

Prekey Generation 

Prekey generation is similar to the one given in Section 3. TA generates
$(p, q, a)$, publishes $(n, f, \alpha, \beta)$ and keeps $(p, q, a)$ secret. The
role of the TA is completed at this stage and the remaining steps do not require
the TA.

Prekey Verification 

Prekey verification is similar to section 3. 

Key Co-generation 

Each signer $S_i$ chooses a secret key consisting of four randomly chosen elements
of $GF(f)$, that is, $a_{i1}, a_{i2}, b_{i1}, b_{i2} \in GF(f)$, and constructs
shares of his secret key for the other users. For this purpose he randomly selects
four polynomials of degree $t - 1$ over $GF(f)$, denoted by $e_{ij}(x)$ and
$f_{ij}(x)$, $j = 1, 2$, such that $e_{ij}(0) = a_{ij}$ and $f_{ij}(0) = b_{ij}$,
for $j = 1, 2$. The share for $S_k$ ($k = 1, 2, \ldots, n$; $k \neq i$) is

$$a_{ikj} = e_{ij}(x_k) \pmod f$$
$$b_{ikj} = f_{ij}(x_k) \pmod f$$

where $j = 1, 2$.

Next, he constructs a public key to be used by $S_k$, as

$$\rho_{ik1} = \alpha^{a_{ik1}}\beta^{a_{ik2}} \pmod n$$
$$\rho_{ik2} = \alpha^{b_{ik1}}\beta^{b_{ik2}} \pmod n \qquad (2)$$

and his own public key,

$$\rho_{i1} = \alpha^{a_{i1}}\beta^{a_{i2}} \pmod n$$
$$\rho_{i2} = \alpha^{b_{i1}}\beta^{b_{i2}} \pmod n \qquad (3)$$

The public keys in equations 2 and 3 are placed in a secure public directory, and
the shares of his secret key are sent to the other $S_k$ via a secure channel.

The group public key is

$$\gamma_1 = \prod_{i=1}^{n} \rho_{i1} = \alpha^{\sum_{i=1}^{n} a_{i1}}\beta^{\sum_{i=1}^{n} a_{i2}} \pmod n$$
$$\gamma_2 = \prod_{i=1}^{n} \rho_{i2} = \alpha^{\sum_{i=1}^{n} b_{i1}}\beta^{\sum_{i=1}^{n} b_{i2}} \pmod n$$

Message Co-signing 

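Without loss of generality, assume that signers $S_1, \ldots, S_t$ want to sign a
message $m \in GF(f)$. Each $S_i$ generates his partial signature as

$$y_{i\ell} = \left(a_{i\ell} + \sum_{j=t+1}^{n} a_{ji\ell} \times \prod_{k=1, k \neq i}^{t} \frac{-x_k}{x_i - x_k}\right) + \left(b_{i\ell} + \sum_{j=t+1}^{n} b_{ji\ell} \times \prod_{k=1, k \neq i}^{t} \frac{-x_k}{x_i - x_k}\right) \times m \pmod f$$

for $\ell = 1, 2$. Then, $S_i$ sends $y_{i1}$ and $y_{i2}$ to the combiner $C$.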

Partial Signature Verification 

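$C$ can verify the correctness of $S_i$'s partial signature by checking whether

$$\left(\rho_{i1} \prod_{j=t+1}^{n} \rho_{ji1}^{\prod_{k=1, k \neq i}^{t} \frac{-x_k}{x_i - x_k}}\right) \times \left(\rho_{i2} \prod_{j=t+1}^{n} \rho_{ji2}^{\prod_{k=1, k \neq i}^{t} \frac{-x_k}{x_i - x_k}}\right)^{m} = \alpha^{y_{i1}}\beta^{y_{i2}} \pmod n$$

holds. If the equation does not hold, then the partial signature is rejected.

Threshold Signature Generation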

After receiving the partial signatures from $S_i$, where $i = 1, \ldots, t$, $C$
can construct the FSS signature by calculating

$$y_\ell = \sum_{i=1}^{t} y_{i\ell} \pmod f$$

for $\ell = 1, 2$. This gives the signature on message $m \in GF(f)$ as
$(y_1, y_2)$, with $y_1, y_2 \in GF(f)$.

Signature Verification 

The signature $(y_1, y_2)$ on message $m$ can be verified by testing whether

$$\gamma_1\gamma_2^{m} = \alpha^{y_1}\beta^{y_2} \pmod n$$
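
The threshold flow (key co-generation, co-signing, partial signature verification, combination and final verification) can be sketched end to end in Python. This is a toy illustration under explicit assumptions, not the authors' code: tiny primes (f = 11, p = 67, q = 419), five signers with threshold three, public points $x_i = i$, a stand-in secret exponent 7, and hypothetical names such as `co_generate`, `partial_sign`, `partial_ok`, `rho` and `rho_share`.

```python
import secrets

F, P, Q = 11, 67, 419  # toy prekey (illustrative only): P = 2*F*3 + 1, Q = 2*F*19 + 1
N = P * Q


def order_f_element(prime):
    return next(g for g in (pow(h0, (prime - 1) // F, prime) for h0 in range(2, prime)) if g != 1)


ALPHA = (order_f_element(P) * Q * pow(Q, -1, P) + order_f_element(Q) * P * pow(P, -1, Q)) % N
BETA = pow(ALPHA, 7, N)  # 7 stands in for the TA's secret exponent


def h(u, v):
    """h(u, v) = alpha^u * beta^v mod n."""
    return pow(ALPHA, u % F, N) * pow(BETA, v % F, N) % N


def poly_eval(coeffs, x):
    """Evaluate a polynomial over GF(F); coeffs[0] is the constant term."""
    return sum(c * pow(x, d, F) for d, c in enumerate(coeffs)) % F


def lagrange(i, signers, xs):
    """Product over k in signers, k != i, of (-x_k)/(x_i - x_k) mod F."""
    num = den = 1
    for k in signers:
        if k != i:
            num = num * (-xs[k]) % F
            den = den * (xs[i] - xs[k]) % F
    return num * pow(den, -1, F) % F


def co_generate(n_signers, t):
    xs = {i: i for i in range(1, n_signers + 1)}  # public x_i, distinct and nonzero
    # polys[i] holds four degree-(t-1) polynomials with constant terms a_i1, a_i2, b_i1, b_i2.
    polys = {i: [[secrets.randbelow(F) for _ in range(t)] for _ in range(4)] for i in xs}
    secret = {i: tuple(c[0] for c in polys[i]) for i in xs}
    # shares[i][k] is the share of S_i's key that is sent to S_k.
    shares = {i: {k: tuple(poly_eval(c, xs[k]) for c in polys[i]) for k in xs if k != i}
              for i in xs}
    return xs, secret, shares


def public_pair(key):
    a1, a2, b1, b2 = key
    return h(a1, a2), h(b1, b2)


def partial_sign(i, signers, m, xs, secret, shares):
    lam = lagrange(i, signers, xs)
    absent = [j for j in xs if j not in signers]
    s = [secret[i][c] + lam * sum(shares[j][i][c] for j in absent) for c in range(4)]
    return (s[0] + s[2] * m) % F, (s[1] + s[3] * m) % F


def partial_ok(i, signers, m, y, xs, rho, rho_share):
    lam = lagrange(i, signers, xs)
    left1, left2 = rho[i]
    for j in xs:
        if j not in signers:
            left1 = left1 * pow(rho_share[j][i][0], lam, N) % N
            left2 = left2 * pow(rho_share[j][i][1], lam, N) % N
    return left1 * pow(left2, m, N) % N == h(*y)


xs, secret, shares = co_generate(n_signers=5, t=3)
rho = {i: public_pair(secret[i]) for i in xs}
rho_share = {i: {k: public_pair(s) for k, s in shares[i].items()} for i in xs}
gamma1 = gamma2 = 1
for r1, r2 in rho.values():
    gamma1, gamma2 = gamma1 * r1 % N, gamma2 * r2 % N

signers, m = [1, 2, 3], 4
partials = {i: partial_sign(i, signers, m, xs, secret, shares) for i in signers}
y = tuple(sum(p[c] for p in partials.values()) % F for c in range(2))
signature_ok = gamma1 * pow(gamma2, m, N) % N == h(*y)
```

The combined signature verifies against the group key because the Lagrange factors make each absent signer's shares sum to that signer's secret, so the sum of partial signatures is exactly the signature under the summed keys. A partial signature that has been tampered with fails `partial_ok`, which is the cheater detection property.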



Proof of Forgery 

Suppose a subgroup $Q$ of $t' < t$ signers want to show that a message-signature
pair $(m, y)$ is forged. They need to find $t - t'$ honest signers to construct a
second signature on $m$. This can be done as follows.

1. The group submit their partial signatures to the combiner and request that the
proof of forgery algorithm be run.

2. The combiner collects $(m, y)$ and the $t'$ partial signatures and sends a
request for a partial signature on $m$ to signers $P_{i_1}, P_{i_2}, \ldots$,
rejects the invalid signatures and continues until $t - t'$ valid signatures are
received. This step will finish because we assume that there exists a group of
$t$ honest signers.

3. The combiner constructs a signature $y'$ on $m$ which can be used as a
proof of forgery.

5.1 Security Analysis 

We need to consider two types of attack, namely attacks from

1. a colluding group of at most $t - 1$ polynomially bounded signers;

2. an enemy with unlimited computational power (security for the signers).

In both cases the aim of the attacker(s) is to forge a signature that is acceptable 
by the receiver and cannot be proved to be a forgery. 

Theorem 3. A collusion of $(t - 1)$ polynomially bounded signers who have access
to previous communications and signatures cannot generate a forged signature.

Proof (sketch). We prove security by a simulation argument for the view of the
enemy, showing that an enemy who has access to all the key information of the
$(t - 1)$ corrupted group members and the signature on $m$ could generate by
itself all the other public information produced by the protocol.

Without loss of generality, we consider only the first $t$ group members
$S_1, S_2, \ldots, S_t$. Assume that the enemy $\mathcal{E}$ has corrupted
$S_1, S_2, \ldots, S_{t-1}$ and has learned their secrets. We give a simulator
SIMU for our scheme. The input to the simulator SIMU is the message $m$ together
with the signature on it, $(y_1, y_2)$. However, the secret information held by
$S_t$ is never exposed and is not simulated. The simulator SIMU works as follows.

1. Key Generation

- Chooses $a_{i1}, a_{i2}, b_{i1}, b_{i2} \in GF(f)$, for $i = 1, \ldots, t - 1$.

- Executes the key co-generation step as in the original scheme. At the end
of the key co-generation phase of the simulation, each $S_i$
($i = 1, \ldots, t - 1$) holds the following values.

a) $a_{i1}, a_{i2}, b_{i1}, b_{i2} \in GF(f)$;

b) $\rho_{i1}, \rho_{i2} \in Z_n^*$;

c) $\rho_{ik1}, \rho_{ik2} \in Z_n^*$ for $k = 1, 2, \ldots, n$; $k \neq i$.

2. Message Co-Signing and Verification

- Computes the partial signatures using the original scheme, namely $y_{i\ell}$,
for $i = 1, \ldots, t - 1$ and $\ell = 1, 2$.

- Computes

$$y_{t\ell} = y_\ell - \sum_{i=1}^{t-1} y_{i\ell}$$

for $\ell = 1, 2$.

It is straightforward to verify that the view of the enemy $\mathcal{E}$ on
execution of the protocol and its view on execution of SIMU are statistically
indistinguishable, and so the result follows. □

To analyze the sender's security we consider a collusion of the two types of
enemy. That is, we assume that the unbounded enemy has access to the secret keys
of $t - 1$ signers and wants to forge the partial signature of another signer not
in the colluding group.

Theorem 4. The success chance of an unbounded enemy, who is assisted by
$(t - 1)$ polynomially bounded signers, to forge a signature that cannot be shown
to be a forgery by a group of $t$ honest signers is $2^{-\sigma}$.

Proof (sketch). We assume that a valid message-signature pair and all the
corresponding partial signatures are known, and that the colluding enemies want
to obtain the secret of another signer. Without loss of generality, assume that
$S_1, \ldots, S_t$ have generated a correct signature $(y_1, y_2)$ for
$m \in GF(f)$, and that $S_1, \ldots, S_{t-1}$ are the colluding members. Hence,
they can construct

$$y_{t\ell} = y_\ell - \sum_{i=1}^{t-1} y_{i\ell} \pmod f$$

for $\ell = 1, 2$, where $y_{t\ell}$ has the form given in the message co-signing
step. Since the colluders know

$$c_1 = \sum_{j=t+1}^{n} a_{jt\ell} \times \prod_{k=1, k \neq t}^{t} \frac{-x_k}{x_t - x_k} \pmod f$$

and

$$c_2 = \sum_{j=t+1}^{n} b_{jt\ell} \times \prod_{k=1, k \neq t}^{t} \frac{-x_k}{x_t - x_k} \pmod f$$

the equation can be rewritten as (for $\ell = 1, 2$):

$$y_{t\ell} - c_1 - c_2 m = a_{t\ell} + b_{t\ell} m \pmod f$$

or

$$w_1 = a_{t1} + b_{t1} m \pmod f$$
$$w_2 = a_{t2} + b_{t2} m \pmod f \qquad (4)$$

With the additional knowledge of the unbounded enemy, who is able to solve the
underlying hard problem, from the public key of $S_t$ they can construct

$$w_3 = a_{t1} + a\,a_{t2} \pmod f \quad (\text{from } \rho_{t1})$$
$$w_4 = b_{t1} + a\,b_{t2} \pmod f \quad (\text{from } \rho_{t2}) \qquad (5)$$

where $a$ is obtained from $\log_\alpha(\beta)$. Combining equations 4 and 5, the
following system is obtained:

$$w_1 = a_{t1} + b_{t1} m \pmod f$$
$$w_2 = a_{t2} + b_{t2} m \pmod f$$
$$w_3 = a_{t1} + a\,a_{t2} \pmod f$$
$$w_4 = b_{t1} + a\,b_{t2} \pmod f$$

This system can be rewritten as

$$\begin{pmatrix} w_1 \\ w_2 \\ w_3 \\ w_4 \end{pmatrix} = \begin{pmatrix} 1 & 0 & m & 0 \\ 0 & 1 & 0 & m \\ 1 & a & 0 & 0 \\ 0 & 0 & 1 & a \end{pmatrix} \begin{pmatrix} a_{t1} \\ a_{t2} \\ b_{t1} \\ b_{t2} \end{pmatrix} \pmod f$$

It is easy to see that this matrix has rank 3 (this is true because
$r_3 + m r_4 - r_1 - a r_2 = 0$, where $r_i$ is the $i$-th row of the matrix, and
noting that the submatrix consisting of the first 3 columns has 3 independent
rows), and therefore there are $f$ different solutions to this system, which are
equally likely. □
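
The rank claim can be checked numerically for a small modulus. The sketch below is an illustration only: it assumes the coefficient matrix has rows $(1, 0, m, 0)$, $(0, 1, 0, m)$, $(1, a, 0, 0)$, $(0, 0, 1, a)$ acting on $(a_{t1}, a_{t2}, b_{t1}, b_{t2})$, uses the toy modulus 11, and the helper names `rank_mod_p` and `system_matrix` are hypothetical.

```python
def rank_mod_p(rows, p):
    """Rank of an integer matrix over GF(p), by Gauss-Jordan elimination (p prime)."""
    rows = [[v % p for v in r] for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        # Find a row at or below `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)
        rows[rank] = [v * inv % p for v in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [(v - factor * w) % p for v, w in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def system_matrix(m, a):
    """Coefficient matrix of the four equations in (a_t1, a_t2, b_t1, b_t2)."""
    return [
        [1, 0, m, 0],  # w1 = a_t1 + b_t1 * m
        [0, 1, 0, m],  # w2 = a_t2 + b_t2 * m
        [1, a, 0, 0],  # w3 = a_t1 + a * a_t2
        [0, 0, 1, a],  # w4 = b_t1 + a * b_t2
    ]


f = 11
ranks = {rank_mod_p(system_matrix(m, a), f) for m in range(f) for a in range(f)}
```

A rank of 3 in four unknowns leaves a one-dimensional solution space over $GF(f)$, which is where the count of $f$ candidate keys comes from.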



Theorem 5. The proposed Threshold FSS scheme is secure for the receiver under
the DL assumption.

Proof (sketch). We provide a proof by contradiction. Assume that there is an
algorithm $A$ which on input $(sk_1, \ldots, sk_t)$ (where $sk_i$ denotes the
secret key of $S_i$) and a message $m$ can produce two different signatures on
$m$, namely $(y_1, y_2)$ and $(y_1', y_2')$. Then, we can construct an algorithm
$D$ that on input $(n, f, \alpha, \beta)$, where $\beta = \alpha^a \pmod n$ and
$ord_n(\alpha) = f$, outputs $a$ as follows.

First, $A$ is run with an arbitrary $m \in Z_f$, and the two signatures on it are
produced. Then, $D$ computes:

$$\alpha^{y_1}\beta^{y_2} = \alpha^{y_1'}\beta^{y_2'} \pmod n$$
$$(y_1 - y_1') = a(y_2' - y_2) \pmod f$$
$$a = (y_1 - y_1')(y_2' - y_2)^{-1} \pmod f$$

$D$ is successful with the same probability as $A$ and almost equally efficient.
Hence, it contradicts the DL assumption. □

6 Conclusion 

We have constructed a new FSS scheme based on discrete logarithm and facto- 
rization and have shown that it is as efficient as the most efficient known FSS 
scheme, namely vHP scheme. We have also extended the proposed FSS scheme 
to a threshold FSS scheme, in which key generation does not require a trusted 
authority, and is equipped with cheater detection which is essential for provi- 
ding proof of forgery in a distributed environment. The threshold scheme has the 
same threshold t out of n, for generation of signature and for proving forgery. 
We note that the two thresholds need not be the same and in fact ideally a single 
signer should be able to prove a forgery has occurred. This is an interesting open 
problem that needs further research. We have also shown a complete proof of 
security for our proposed scheme. 
Abstract. Signcryption is a public-key cryptographic primitive introduced
by Zheng, which achieves both message confidentiality and non-repudiatable
origin authenticity, at a lower computational and communication overhead
cost than the conventional ‘sign-then-encrypt’ approach.
We propose a new signcryption scheme which gives a partial solution
to an open problem posed by Zheng, namely to find a signcryption
scheme based on the integer factorization problem. In particular, we
prove that our scheme is existentially unforgeable, in the random oracle
model, subject to the assumption that factoring an RSA modulus $N = pq$
(with $p$ and $q$ prime) is hard even when given the additional pair $(g, S)$,
where $g \in \mathbb{Z}_N^*$ is an asymmetric basis of large order less than a
bound $S/2 \ll \sqrt{N}$.



1 Introduction 

Confidentiality and non-repudiatable origin authenticity of transmitted infor- 
mation are important requirements in many applications of cryptography. The 
conventional approach to meeting these goals is the ‘sign-then-encrypt’ techni- 
que, where the message originator produces a digital signature on the message 
using his secret key and then encrypts the signed message with the recipient’s 
public key. 

Several years ago, Zheng [15] introduced a public-key cryptographic primitive
called signcryption, which achieves both confidentiality and non-repudiatable
origin authenticity at a lower computational and communication overhead cost
than the ‘sign-then-encrypt’ technique. The original signcryption schemes were
based on the discrete logarithm problem $DLP(p, q)$ in a multiplicative subgroup
of prime order $q$ in $\mathbb{Z}_p^*$ for $p$ prime. These signcryption schemes
are currently regarded as secure because there is no known efficient algorithm
for solving $DLP(p, q)$. However, the fact that $DLP(p, q)$ is currently not a
provably difficult problem (and hence is vulnerable to algorithmic breakthroughs)
motivates the search for signcryption schemes based on other difficult problems
which are computationally independent of $DLP(p, q)$. Consistent with this
approach, Zheng posed, in his original paper [15], the open problem of finding a
signcryption scheme based on the integer factorization problem, which appears to
be computationally independent of $DLP(p, q)$.

In this paper, we propose a partial solution to Zheng’s open problem. In
particular, we modify an efficient signature scheme recently proposed by
Pointcheval [9] to obtain a new signcryption scheme heuristically based on the
Composite Discrete Logarithm problem $CDLP(N, g, S, y)$ with suitably chosen
parameters, namely: Given $(N, g, S, y)$, where $N$ is a composite RSA modulus
$N = pq$ (with $p$ and $q$ large primes, $p - 1$ and $q - 1$ non-smooth), $g$ is
an element of $\mathbb{Z}_N^*$ satisfying
$1 \ll Ord_{\mathbb{Z}_N^*}(g) < S/2 \ll \sqrt{N}$ (with
$Ord_{\mathbb{Z}_N^*}(g)$ having no small prime factors besides 2) and
$m_2(Ord_{\mathbb{Z}_p^*}(g)) \neq m_2(Ord_{\mathbb{Z}_q^*}(g))$ (where $m_2(z)$
denotes the largest $\alpha$ such that $2^{\alpha}$ divides $z$), and
$y \in \langle g \rangle$ (where $\langle g \rangle$ denotes the multiplicative
subgroup of $\mathbb{Z}_N^*$ generated by $g$), find $x \in \mathbb{N}$ such that
$g^x = y \bmod N$.

For the above choice of parameters, $CDLP(N, g, S, y)$ is harder than the
problem of factoring $N$ given $(N, g, S)$ because (by Lemma 1 in the Appendix) a
non-zero multiple of $Ord_{\mathbb{Z}_N^*}(g)$ reveals the factorization of $N$,
and it is easy to compute such a multiple using an oracle which solves
$CDLP(N, g, S, y)$. We conjecture that, assuming the symmetric cipher and one-way
hash functions used by our scheme have no individual weaknesses, any security
break of our scheme is harder than factoring $N$. To support this claim, we prove,
in the random oracle model, that under the assumption that factoring $N$ given
$(N, g, S)$ is intractable, our scheme is existentially unforgeable under an
adaptively chosen message attack. The confidentiality of our scheme is based on
the Diffie-Hellman problem in $\mathbb{Z}_N^*$ with base $g$, which is also
believed to be harder than factoring $N$.

We call our proposal a ‘partial’ solution to Zheng’s problem for the following 
reasons: 

(1) Our scheme’s unforgeability is proven with respect to a non-standard
factorization problem, due to the additional knowledge of the element
$g \in \mathbb{Z}_N^*$ whose order is large but smaller than a known bound
$S \ll \sqrt{N}$. However, with a suitable choice of parameters to make direct
CDLP attacks infeasible, the problem appears to be as hard as the standard RSA
modulus factorization problem.

(2) Our scheme requires a trusted authority for generating the common pa- 
rameters shared by all users. 

The rest of the paper is organized as follows. In Sect. 2 we review related
past work. In Sect. 3, we detail our new signcryption scheme SCF. In Sect. 4, we
suggest practical values for our scheme’s parameters and compare its efficiency
with earlier schemes. In Sect. 5, we present definitions of security notions for
general signcryption schemes, and state these properties for our particular
scheme. Due to lack of space the proofs are omitted here and are included in the
full paper, available from the authors. Finally, Sect. 6 contains concluding
remarks.

2 Background

In 1989, Schnorr [13] proposed a well-known efficient 3-move zero-knowledge
identification scheme and a corresponding signature scheme based on the
difficulty of computing discrete logarithms in a subgroup of $\mathbb{Z}_p^*$
(for $p$ a public prime) generated by a public element $g \in \mathbb{Z}_p^*$ of
public prime order $q$. We quickly review the Schnorr signature here to allow
the following schemes to be explained. The signer Alice picks a random secret
key $s_A \in \mathbb{Z}_q$ and publishes her public key
$v_A = g^{-s_A} \bmod p$. To sign a message $m$, Alice picks a random
$r \in \mathbb{Z}_q$ and computes the signature consisting of the pair $(e, y)$,
with $e = H(m, g^r \bmod p)$ and $y = r + e \cdot s_A \bmod q$, where $H(.)$
denotes a ‘one-way’ hash function. Bob verifies that the pair $(e', y')$ is
Alice’s signature on message $m'$ by checking that
$e' = H(m', g^{y'} v_A^{e'} \bmod p)$. An attractive feature of Schnorr’s scheme
is its low on-line computational cost, since the time-consuming modular
exponentiation $g^r \bmod p$ is message-independent and can be performed
off-line.
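
The Schnorr signing and verification equations reviewed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny parameters `p = 2267`, `q = 103`, the use of SHA-256 for $H$, and the helper names `H`, `keygen`, `sign` and `verify` are all choices made here, not taken from [13]. The hash output is deliberately left unreduced so that the toy group size does not cause accidental collisions.

```python
import hashlib
import secrets

# Toy parameters (assumed for illustration, far too small to be secure):
# q is a prime dividing p - 1 = 2 * 11 * 103.
p, q = 2267, 103
g = pow(2, (p - 1) // q, p)          # element of order q in Z_p^*
assert g != 1 and pow(g, q, p) == 1

def H(m: bytes, x: int) -> int:
    """Hash of (message, commitment), returned as an integer."""
    return int.from_bytes(hashlib.sha256(m + x.to_bytes(32, "big")).digest(), "big")

def keygen():
    s = secrets.randbelow(q - 1) + 1  # secret key in {1, ..., q-1}
    v = pow(g, q - s, p)              # public key v = g^{-s} mod p (g has order q)
    return s, v

def sign(m: bytes, s: int):
    r = secrets.randbelow(q)
    x = pow(g, r, p)                  # message-independent: can be done off-line
    e = H(m, x)
    y = (r + e * s) % q               # on-line part: one multiplication mod q
    return e, y

def verify(m: bytes, sig, v: int) -> bool:
    e, y = sig
    # g^y * v^e = g^(r + e*s) * g^(-s*e) = g^r, so the hash must reproduce e.
    return e == H(m, pow(g, y, p) * pow(v, e, p) % p)

s_A, v_A = keygen()
signature = sign(b"hello Bob", s_A)
print(verify(b"hello Bob", signature, v_A))
```

The split between `pow(g, r, p)` (off-line) and the single modular multiplication in `sign` (on-line) is the efficiency feature that the later schemes build on.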

In 1991, Girault [6] proposed a variant of Schnorr’s scheme, replacing the
group $\mathbb{Z}_p^*$ by the group $\mathbb{Z}_N^*$, where $N = pq$ ($p$, $q$
prime). In addition, Girault proposed the use of a subgroup generator
$g \in \mathbb{Z}_N^*$ of maximal order $\lambda(N) = \mathrm{lcm}(p - 1, q - 1)$.
Since Miller’s factoring algorithm (see [12]) factors $N$ efficiently when a
multiple of $\lambda(N)$ is known, the order of the public generator $g$ must be
kept secret from all users. This has the implication for the signing algorithm
that the modular reduction in the computation of $y$ cannot be performed.
Indeed, Girault proposed to eliminate the modular reduction so that $y$ is
computed in the integers. This further improves the on-line computational
efficiency at the expense of a longer resulting signature. In 1998, Poupard and
Stern [11] analysed Girault’s scheme and proved its security relative to the
discrete logarithm problem in the subgroup $\langle g \rangle$ of
$\mathbb{Z}_N^*$. However (see Sect. 4), their proof does not appear to apply to
the efficient variants they propose, in which the size of the secret key space
$S$ is much smaller than $\sqrt{N}$. At PKC 2000, Pointcheval [9] proposed a
new variant of Girault’s scheme in which $S \ll \sqrt{N}$, yielding an efficient
scheme as before, but $g$ is chosen differently to allow a proof of security
relative to factorization. In particular, Pointcheval proposed to reduce the
order of $g$ from $\lambda(N)$ to a large value less than $S/2$, so that at
least two secret keys are mapped to each public key. Then the reduction from
factorization to breaking the scheme could be performed (in the random oracle
model, with Pointcheval and Stern’s forking technique [10]) using a variant of
Miller’s factoring algorithm (which requires an additional condition on the
choice of $g$) and the fact that an attacker cannot distinguish between
signatures made, respectively, with two keys mapping to the same public key
(i.e. the scheme is witness indistinguishable). Our signcryption scheme is based
on Pointcheval’s approach.

3 New Signcryption Scheme

Our signcryption scheme SCF is defined as follows. In the following, when we
say $X$ is ‘large’ we mean that $\log(X) = \delta \cdot k$ where $\delta$ is a
positive constant and $k$ is the security parameter.

To set up a cryptographic ‘community’ using the scheme SCF, a trusted
authority generates and publishes three parameters:

Common Parameters Published by Trusted Authority

(1) A large RSA modulus $N = pq$ with $p$ and $q$ random primes of
approximately equal length.

(2) A large secret-key bound $S \ll \sqrt{N}$.

(3) An element $g \in \mathbb{Z}_N^*$ of large order
$1 \ll \mathrm{Ord}_{\mathbb{Z}_N^*}(g) < S/2$, which is an asymmetric basis in
$\mathbb{Z}_N^*$, i.e. the multiplicity of 2 in
$\mathrm{Ord}_{\mathbb{Z}_p^*}(g)$ is not equal to the multiplicity of 2 in
$\mathrm{Ord}_{\mathbb{Z}_q^*}(g)$.

We remark that Pointcheval’s [9] definition of an asymmetric basis in
$\mathbb{Z}_N^*$ is a special case of our more relaxed definition.

Then user Alice generates a secret/public key-pair as follows:

Key-pair generation by user Alice

(1) Alice picks her secret key integer $s_A$ randomly and uniformly in the
interval $\{0, ..., S - 1\}$.

(2) Alice computes and publishes her public key $v_A = g^{-s_A} \bmod N$.

User Bob generates his key-pair $s_B$ and $v_B$ in the same way.

When Alice wishes to send Bob a confidential message $m$ such that Bob can
verify that Alice is its originator, Alice proceeds as follows:

Signcryption of m by Alice the Sender

Step A.1 Alice picks uniformly at random an element $r$ from the set of integers
$\{0, ..., R - 1\}$, where $R$ is such that $2^{k'} = R / (2^{|KH|} S)$ is large.

Step A.2 Alice obtains a trusted copy of Bob’s public key $v_B$, computes
$x = v_B^r \bmod N$ and then ‘splits’ this into the pair
$(x_2, x_1) = H_1(x)$, where $H_1(.)$ denotes a ‘one-way’ hash function, which
maps arbitrarily long inputs into a string of length $|H_1|$ bits.

Step A.3 Alice uses a secure symmetric encryption algorithm E (with matching
decryption algorithm D) to encrypt $m$ using key $x_1$ to obtain the ciphertext
$c = E(x_1, m)$.

Step A.4 Alice uses her secret key to compute the pair $(e, y)$ defined by
$e = KH(x_2, m, bind)$ and $y = r + e \cdot s_A$ (note the absence of modular
reduction), where KH(.,.) denotes a ‘keyed’ hash function of output length
$|KH|$ bits, with the first argument being the key. In bind Alice inserts her
own and Bob’s public keys.

Step A.5 Alice sends the ‘signcryptext’ triple $(c, e, y)$ to Bob.

The recipient Bob receives $(c', e', y')$ (which may not be equal to $(c, e, y)$
due to modification by an attacker), and proceeds as follows:
UnSigncryption of (c', e', y') by Bob the Recipient

Step B.1 Bob uses a trusted copy of Alice’s public key $v_A$ and his own secret
key $s_B$ to compute $x' = (g^{y'} v_A^{e'})^{-s_B} \bmod N$ and then, using the
same procedure followed by Alice to split $x$, Bob splits $x'$ into
$(x_2', x_1') = H_1(x')$.

Step B.2 Bob decrypts $c'$ using the symmetric key $x_1'$ into
$m' = D(x_1', c')$.

Step B.3 Bob accepts message $m'$ as being originated from Alice if and only
if $e' = KH(x_2', m', bind)$.



We note that a simple way for Alice to perform the split at Step A.2 is to
set $x_1 = H_1^1(x)$ and $x_2 = H_1^2(x)$, where $H_1^1(x)$ and $H_1^2(x)$ denote
the $|H_1^1|$ least- and $|H_1^2|$ most-significant bits of $H_1(x)$
respectively, with $|H_1^1| + |H_1^2| = |H_1|$ and both $|H_1^1|$ and $|H_1^2|$
large. Also, in practice one can implement KH using a one-way hash function
$H_2(.)$ by concatenating the key $x_2$ and the input $m$ and then hashing the
pair $(x_2, m)$. The bind information containing Bob’s public key prevents a
signcryptext sent by Alice to Bob from being transformed into a valid
signcryptext carrying the same message from Alice to a third user colluding with
Bob (see [15] for details on this ‘double-payment’ problem).
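
To make Steps A.1 to A.5 and B.1 to B.3 concrete, the following Python sketch runs the scheme end to end on toy parameters. Everything specific in it is an assumption made for illustration: the primes `p = 1021`, `q = 1051`, the bound `S = 256`, the sizes `KH_BITS` and `K_PRIME`, SHA-256 standing in for $H_1$ and for KH, and a hash-based XOR stream standing in for the symmetric cipher (E, D). It shows the data flow only and offers no security.

```python
import hashlib
import secrets

# Toy common parameters (assumed values, far too small to be secure).
p, q = 1021, 1051                 # 5 divides p - 1, 14 divides q - 1
N = p * q

def element_of_order(n, prime):
    """Brute-force search for an element of exact order n in Z_prime^*."""
    for h in range(2, prime):
        c = pow(h, (prime - 1) // n, prime)
        if all(pow(c, d, prime) != 1 for d in range(1, n)):
            return c
    raise ValueError("no element of that order")

g_p = element_of_order(5, p)      # odd order modulo p
g_q = element_of_order(14, q)     # order 2 * 7 modulo q, so g is asymmetric
g = (g_p * q * pow(q, -1, p) + g_q * p * pow(p, -1, q)) % N   # CRT, Python 3.8+
S = 256                           # secret-key bound: Ord(g) = 70 < S/2
KH_BITS, K_PRIME = 64, 64
R = S << (KH_BITS + K_PRIME)      # |R| = |KH| + |S| + k'

def to_bytes(x):
    return x.to_bytes((N.bit_length() + 7) // 8, "big")

def split(x):
    h = hashlib.sha256(to_bytes(x)).digest()
    return h[16:], h[:16]         # (x1, x2): least / most significant halves

def cipher(key, data):
    """Toy XOR stream cipher standing in for (E, D); E and D coincide."""
    stream, counter = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def KH(key, m, bind):
    digest = hashlib.sha256(key + m + bind).digest()
    return int.from_bytes(digest[: KH_BITS // 8], "big")

def keygen():
    s = secrets.randbelow(S)
    return s, pow(g, -s, N)       # v = g^{-s} mod N

def signcrypt(m, s_a, v_a, v_b):
    r = secrets.randbelow(R)                          # Step A.1
    x1, x2 = split(pow(v_b, r, N))                    # Step A.2
    c = cipher(x1, m)                                 # Step A.3
    e = KH(x2, m, to_bytes(v_a) + to_bytes(v_b))      # Step A.4
    return c, e, r + e * s_a                          # y over the integers

def unsigncrypt(c, e, y, s_b, v_a, v_b):
    g_r = pow(g, y, N) * pow(v_a, e, N) % N           # equals g^r if untampered
    x1, x2 = split(pow(g_r, -s_b, N))                 # Step B.1: x' = v_b^r
    m = cipher(x1, c)                                 # Step B.2
    return m, e == KH(x2, m, to_bytes(v_a) + to_bytes(v_b))   # Step B.3

s_A, v_A = keygen()
s_B, v_B = keygen()
c, e, y = signcrypt(b"pay Bob 10", s_A, v_A, v_B)
print(unsigncrypt(c, e, y, s_B, v_A, v_B))   # (b'pay Bob 10', True)
```

Note that `y` is never reduced modulo anything, which is why the sender needs no knowledge of the order of `g`; the price is a signcryptext component of roughly `|R|` bits.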

4 Efficiency and Common Parameter Generation

4.1 Communication Overhead 

The communication overhead of our scheme (where $|x| = \lceil \log_2(x) \rceil$
denotes the bit length of $x$) is

$Comm_{SCF} = |e| + |y| = |KH(.)| + |R| = 2|KH(.)| + |S| + k'$   (1)

compared with the communication overhead of the original signcryption scheme
SCS1 [15]:

$Comm_{SCS1} = |KH(.)| + |q|$   (2)

and with the RSA sign-then-encrypt method, assuming Alice and Bob have
moduli $N_A$ and $N_B$ respectively:

$Comm_{RSA} = |N_A| + |N_B|$   (3)
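
As a quick arithmetic check, the three overhead formulas can be evaluated for the parameter sizes used later in Table 1. The sketch below is only an illustration; the function names are invented here, and the inputs are the integer bit lengths printed in the table, so the SCF figure can differ by a bit from the tabulated value, which appears to have been computed from unrounded sizes.

```python
def comm_scf(kh_bits, s_bits, k_prime):
    """Eq. (1): |e| + |y| = 2|KH| + |S| + k'."""
    return 2 * kh_bits + s_bits + k_prime

def comm_scs1(kh_bits, q_bits):
    """Eq. (2): |KH| + |q|."""
    return kh_bits + q_bits

def comm_rsa(na_bits, nb_bits):
    """Eq. (3): |N_A| + |N_B|."""
    return na_bits + nb_bits

# (|N| = |p|, |S| = |q|, |KH| = k') as listed in Table 1.
rows = [(1024, 132, 66), (2048, 188, 94), (4096, 263, 131)]
for n_bits, s_bits, kh_bits in rows:
    scf = comm_scf(kh_bits, s_bits, kh_bits)
    rsa_ratio = comm_rsa(n_bits, n_bits) / scf
    scs1_ratio = comm_scs1(kh_bits, s_bits) / scf
    print(n_bits, scf, round(rsa_ratio, 1), round(scs1_ratio, 1))
```

With $|KH| = k' = |S|/2$ the ratio SCS1/SCF is $(1.5|S|)/(2.5|S|) = 0.6$ regardless of the security level, which matches the constant last column of Table 1.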



4.2 Computational Cost 

The on-line computational cost of our signcryption algorithm S is dominated by
the integer multiplication $e \cdot s_A$, while its off-line computational cost
is dominated by the modular exponentiation $v_B^r \bmod N$. The number of bit
operations for these computations can be estimated as follows. For these
estimates, we assume the well-known square-and-multiply exponentiation algorithm
and classical arithmetic for multiplication (cost of computing $x \cdot y$ is
approximately $|x||y|$) and modular reduction (cost of computing $x \bmod y$ is
approximately $(|x| - |y|)|y|$ when $|x| > |y|$).

$Comp_{SCF, Online S} = |KH(.)||S|$   (4)

and

$Comp_{SCF, Offline S} = 1.5|R|(2|N|^2) = 3(|KH(.)| + |S| + k')|N|^2$   (5)

For comparison, the signcryption scheme SCS1 has computational costs

$Comp_{SCS1, Online S} = |KH(.)||q| + (|KH(.)| + |q| - |q|)|q| = 2|KH(.)||q|$   (6)

and

$Comp_{SCS1, Offline S} = 3|q||p|^2$   (7)

In the RSA ‘sign-then-encrypt’ technique, the signing exponentiation (to Alice’s
secret key) of the hashed message must be performed on-line. We assume an
efficient RSA scheme, in which small public exponents are used (so we neglect
the computational cost of the encryption exponentiation to Bob’s public
exponent) and the Chinese Remainder Theorem (CRT) is used to perform the
exponentiation to Alice’s secret key (even though this forces Alice to store the
secret prime factors of her modulus):

$Comp_{RSA, Online S} = (3/4) \cdot |N_A|^3$   (8)

The unsigncryption operation is dominated by the two exponentiations in the
computation of $x' = (g^{y'} v_A^{e'})^{-s_B} \bmod N$. Using a simple
generalization of the square-and-multiply algorithm as suggested by Shamir (see
full paper version of [15]), one can compute a product of two exponentials, with
exponents $\beta_1$ and $\beta_2$, modulo $N$ using on average
$1.5(|\beta_1| - |\beta_2|) + 1.75|\beta_2|$ multiplications modulo $N$
(assuming without loss of generality that $|\beta_1| > |\beta_2|$). In the case
of our unsigncryption operation, the two exponents are
$\beta_1 = y' \cdot s_B$ and $\beta_2 = e' \cdot s_B$ with lengths
$|\beta_1| = 2|S| + |KH| + k'$ and $|\beta_2| = |KH| + |S|$, so the average bit
operation cost of the unsigncryption of SCF is estimated as:

$Comp_{SCF, U} = (1.5(|S| + k') + 1.75(|S| + |KH|)) \cdot 2|N|^2$   (9)

This should be compared with the cost of unsigncryption for SCS1 (where both
exponents can be reduced mod $q$ prior to performing the exponentiation):

$Comp_{SCS1, U} = 1.75|q|(2|p|^2)$   (10)

The computational cost of the ‘decrypt-then-verify’ RSA algorithm is identical
to that of ‘sign-then-encrypt’ (except that Bob’s secret key / modulus are used
to decrypt).
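
The generalized square-and-multiply method attributed to Shamir in Sect. 4.2 can be sketched as follows. This is a minimal illustration rather than the implementation analysed in [15]; the function name `simultaneous_pow` and the test values are assumptions made here.

```python
def simultaneous_pow(a, x, b, y, n):
    """Compute a^x * b^y mod n with one shared chain of squarings."""
    ab = a * b % n                       # precomputed once
    result = 1
    top = max(x.bit_length(), y.bit_length())
    for i in range(top - 1, -1, -1):
        result = result * result % n     # one squaring per bit position
        bit_x, bit_y = (x >> i) & 1, (y >> i) & 1
        if bit_x and bit_y:
            result = result * ab % n     # both bits set: multiply by a*b
        elif bit_x:
            result = result * a % n
        elif bit_y:
            result = result * b % n
    return result

n = 1021 * 1051
lhs = simultaneous_pow(3, 1234567, 5, 89, n)
rhs = pow(3, 1234567, n) * pow(5, 89, n) % n
print(lhs == rhs)
```

For random exponents, a bit position inside the shorter exponent needs an extra multiplication with probability 3/4, giving about 1.75 multiplications per bit, while the leading bits that belong only to the longer exponent cost about 1.5 each; this is where the estimate $1.5(|\beta_1| - |\beta_2|) + 1.75|\beta_2|$ comes from. For unsigncryption one would presumably call it with the inverses of $g$ and $v_A$ as bases, since the exponent $-s_B$ is negative.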
4.3 Choice of Parameters

As we saw above, efficiency considerations encourage the use of a relatively
small $S \ll \sqrt{N}$ and $N$. However, our scheme is trivially breakable by
solving $CDLP(N, g, S, v_A)$ to extract a secret key from the public one using,
for example, Shanks’ ‘Baby-Step, Giant-Step’ algorithm, which takes about
$\sqrt{S}$ multiplications in $\mathbb{Z}_N^*$. Therefore, $S$ can be chosen
such that a breaking time $T_{CDLP} = \sqrt{S} \cdot T_{mult}(|N|)$ (where
$T_{mult}(|N|)$ estimates the time to perform a multiplication in
$\mathbb{Z}_N^*$) represents a sufficient security level. Our security analysis
suggests that there exist no significantly faster attacks, as long as the time
$T_{FACT}$ required to factor $N$ given $(N, g, S)$ is not less than
$T_{CDLP}$. But when $\mathrm{Ord}_{\mathbb{Z}_N^*}(g)$ is composed of large
prime factors, we know of no algorithm more efficient than CDLP for factoring
$N$ which makes use of $(g, S)$. Therefore, we can choose $|N|$ and $|S|$ so
that the time $T_{NFS}(|N|)$ required to factor $N$ using the fastest known
algorithm for the standard factorization problem (namely the Number Field Sieve)
is not less than $T_{CDLP}$.
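
For readers unfamiliar with the baby-step giant-step attack mentioned above, here is a minimal Python sketch of it for a bounded exponent. The function name `bsgs` and the small test values are assumptions made for illustration; the point is only that the work and the table size both grow like $\sqrt{S}$.

```python
from math import isqrt

def bsgs(g, y, n, bound):
    """Return some x with g^x = y (mod n), searching 0 <= x < m*m where
    m = ceil(sqrt(bound)); return None if no such x exists.
    Requires gcd(g, n) = 1 and Python 3.8+ (modular inverse via pow)."""
    m = isqrt(bound - 1) + 1
    table = {}
    current = 1
    for j in range(m):                   # baby steps: store g^j
        table.setdefault(current, j)
        current = current * g % n
    step = pow(g, -m, n)                 # g^{-m} mod n
    gamma = y % n
    for i in range(m):                   # giant steps: y * g^{-i*m}
        if gamma in table:
            return i * m + table[gamma]
        gamma = gamma * step % n
    return None

n = 1021 * 1051
secret = 777
y = pow(2, secret, n)
x = bsgs(2, y, n, 1000)
print(x is not None and pow(2, x, n) == y)
```

Against the scheme of Sect. 3 an attacker would apply this to the base $g^{-1}$ and target $v_A$ with `bound = S`; any exponent found is equivalent to $s_A$ modulo the order of $g$, which is all that is needed to forge.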

Tables 1 and 2 compare the efficiency of our scheme with the earlier
signcryption scheme SCS1 and RSA ‘sign-then-encrypt’ with comparable security
level. The assumptions made in constructing the tables are the following:

(1) Selection of $|S|$ and $|N|$: Consistent with the discussion above, we chose
$|S|$ and $|N|$ such that $T_{CDLP}(|S|, |N|) = T_{NFS}(|N|)$. We used the
assumptions of Lenstra and Verheul [8] to estimate the functions $T_{NFS}$ and
$T_{CDLP}$. In particular, we took $T_{NFS} = c \cdot L(2^{|N|})$, where
$L(X) = \exp(1.923 \cdot \ln(X)^{1/3} \cdot \ln(\ln(X))^{2/3})$
denotes the well-known heuristic running time estimate for NFS and the con- 
stant c fixed by the 10^ MIPS- Years (MY) effort taken recently to factor the 
512-bit modulus RSA-155 [3]. We took Tcdlp{\N\, I^I) = , with the 

constant h fixed by the (2.2 • 10® • \/2/9) MY effort estimated for Tcdlp{\N\, | A |) 
with |A| = [S'! = 109 bit (we refer the reader to [8] for more details). Since 
the signcryption scheme SCS1 can be attacked using a DLP variant of the NFS
algorithm in $\mathbb{Z}_p^*$ with time estimate $T_{NFS}(|p|)$, or
alternatively, using Shanks’ baby-step giant-step algorithm in $\mathbb{Z}_p^*$
in time about $\sqrt{q} \cdot T_{mult}(|p|)$, we set, for comparable security
level between the schemes SCS1 and SCF, $|q| = |S|$ and $|p| = |N|$. Similarly,
we assume the RSA modulus length $|N_A| = |N_B|$ is equal to $|N|$ in SCF.

(2) Selection of $|KH(.)|$: Given a signcryptext $(c, e, y)$ from Alice to Bob,
Bob can create an existential forgery $(c', e, y)$ of a signcryptext from Alice
to Bob by finding a new message $m'$ such that
$KH((g^y v_A^e)^{-s_B}, m', bind) = e$. If KH(.) is a collision-resistant hash
function, Bob would be expected to invest about $2^{|KH(.)|}$ operations to
complete this attack. Therefore, to be consistent with the security level
defined by the choice of $|S|$ and $|N|$, we assume $|KH(.)| = |S|/2$.

(3) Selection of $k'$: From the statement of Theorem 1 in Sect. 5.3, one can see
that the lower bound on the factoring algorithm success probability is
significant only when $2^{k'}$ is much greater than the number of queries an
active attacker is allowed to make to the signcryption algorithm. Since the
number of queries an active attacker can make is normally much lower than the
number of off-line operations available to him, we assume that a choice of
$k' = |KH(.)|$ will satisfy the above requirement.
Of particular note is the efficiency tradeoff offered by our scheme compared
with the earlier signcryption scheme: the on-line signcryption computational
cost has been cut by about half due to the absence of the modular reduction,
but this comes at the expense of an approximate doubling in the communication
overhead, and an increase in the unsigncryption computational cost for the same
reason.



Table 1. Comparison of communication overhead of the proposed signcryption
scheme SCF with RSA-based Signature-Then-Encryption (with small public exponents
and CRT decryption) and with the original signcryption scheme SCS1.

  Security parameters (bits)            SCF comm.        Comm. overhead  Comm. overhead
  |N| = |p|  |S| = |q|  |KH(.)| = k'    overhead (bits)  RSA / SCF       SCS1 / SCF
  ---------  ---------  ------------    ---------------  --------------  --------------
  1024       132        66              329              6.2             0.6
  2048       188        94              469              8.7             0.6
  4096       263        131             657              12.5            0.6
  8192       363        181             907              18.1            0.6
  10240      402        201             1004             20.4            0.6




Table 2. Ratio comparison of on-line (by sender) and total (by sender and
receiver) computation cost of the proposed signcryption scheme SCF with
RSA-based Signature-Then-Encryption (using small public exponents and CRT
decryption) and with the original signcryption scheme SCS1.

  Security parameters (bits)            Online     Online      Total      Total
  |N| = |p|  |S| = |q|  |KH(.)| = k'    RSA / SCF  SCS1 / SCF  RSA / SCF  SCS1 / SCF
  ---------  ---------  ------------    ---------  ----------  ---------  ----------
  1024       132        66              4.7E+04    2.0         0.74       0.41
  2048       188        94              1.8E+05    2.0         1.04       0.41
  4096       263        131             7.5E+05    2.0         1.48       0.41
  8192       363        181             3.1E+06    2.0         2.15       0.41
  10240      402        201             5.0E+06    2.0         2.43       0.41




4.4 Common Parameter Generation 

Our unforgeability security proof in Sect. 5.3 requires that the common
parameters satisfy (1) $N = p \cdot q$ for $p$ and $q$ prime, (2)
$\mathrm{Ord}_{\mathbb{Z}_N^*}(g) < S/2$, (3) $g$ is an asymmetric basis in
$\mathbb{Z}_N^*$ and (4) factoring $N$ given $(g, N, S)$ is intractable. We
cannot prove that a practical common parameter generation algorithm GenComm
exists which satisfies property (4). Based on known attacks on the standard
factorization problem and CDLP, however, we conjecture that imposing the
following additional properties will allow (4) to be met: (5) $p - 1$ and
$q - 1$ are not smooth (i.e. have at least one large prime factor) and (6)
$\mathrm{Ord}_{\mathbb{Z}_N^*}(g)$ is large and composed of large prime factors
(besides an unavoidable factor of 2 due to the asymmetricity requirement). We
therefore believe that the trusted authority can satisfy all the properties (1)
to (6) using the following practical implementation of GenComm:

(1) Pick distinct random primes $r_p$ and $r_q$, each of length $(|S| - 6)/2$.

(2) Pick random odd $t_p$ of length $|N|/2 - |r_p|$ until $p = t_p r_p + 1$
passes a primality test.

(3) Pick random odd $t_q$ of length $|N|/2 - |r_q| - 1$ until
$q = 2 t_q r_q + 1$ passes a primality test.

(4) Compute $N = pq$.

(5) Pick random $h_p \in \mathbb{Z}_p^*$ until
$g_p = h_p^{(p-1)/r_p} \bmod p \neq 1$.

(6) Pick random $h_q \in \mathbb{Z}_q^*$ (with $h_q \bmod q \neq q - 1$) until
$g_q = h_q^{(q-1)/(2 r_q)} \bmod q$ has order exactly $2 r_q$, i.e.
$g_q^2 \bmod q \neq 1$ and $g_q^{r_q} \bmod q \neq 1$.

(7) Using CRT compute
$g = g_p q (q^{-1} \bmod p) + g_q p (p^{-1} \bmod q) \bmod N$.

Note that the resulting $g$ has order

$\mathrm{Ord}_{\mathbb{Z}_N^*}(g) = \mathrm{lcm}(\mathrm{Ord}_{\mathbb{Z}_p^*}(g), \mathrm{Ord}_{\mathbb{Z}_q^*}(g)) = \mathrm{lcm}(r_p, 2 r_q) = 2 r_p r_q$   (11)

Finally, we remark that the requirement of our scheme to share among all
users a set of common parameters defining a specific group is similar to the
recommendation in several recent standards (see [5] and [14]) of specific
elliptic curves and group parameters for the implementation of elliptic curve
cryptographic algorithms.
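
The parameter generation procedure of this section can be sketched in Python as below. This is a hedged illustration, not the authors' implementation: the bit sizes are toy values, the Miller-Rabin helper and all function names are introduced here, and the sketch assumes $p = 2 t_p r_p + 1$ (with the factor 2, so that $p$ is odd while the order of $g$ modulo $p$ stays odd), which is an assumption of this example.

```python
import secrets

def is_probable_prime(n, rounds=20):
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for sp in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29):
        if n % sp == 0:
            return n == sp
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True

def random_odd(bits):
    return secrets.randbits(bits) | (1 << (bits - 1)) | 1

def random_prime(bits):
    while True:
        candidate = random_odd(bits)
        if is_probable_prime(candidate):
            return candidate

def gen_comm(n_bits=64, s_bits=26):
    """Return (N, S, g, trapdoor) with Ord(g) = 2*r_p*r_q < S/2 (toy sizes)."""
    r_bits = (s_bits - 6) // 2
    t_bits = n_bits // 2 - r_bits - 1
    r_p = random_prime(r_bits)
    r_q = random_prime(r_bits)
    while r_q == r_p:
        r_q = random_prime(r_bits)
    while True:                       # assumed form p = 2*t_p*r_p + 1
        p = 2 * (secrets.randbits(t_bits) | (1 << (t_bits - 1))) * r_p + 1
        if is_probable_prime(p):
            break
    while True:                       # q = 2*t_q*r_q + 1 with t_q odd
        q = 2 * random_odd(t_bits) * r_q + 1
        if q != p and is_probable_prime(q):
            break
    while True:                       # g_p of order r_p (odd) in Z_p^*
        g_p = pow(secrets.randbelow(p - 1) + 1, (p - 1) // r_p, p)
        if g_p != 1:
            break
    while True:                       # g_q of order exactly 2*r_q in Z_q^*
        g_q = pow(secrets.randbelow(q - 1) + 1, (q - 1) // (2 * r_q), q)
        if pow(g_q, 2, q) != 1 and pow(g_q, r_q, q) != 1:
            break
    N = p * q
    g = (g_p * q * pow(q, -1, p) + g_q * p * pow(p, -1, q)) % N   # CRT
    return N, 1 << s_bits, g, (p, q, r_p, r_q)

N, S, g, (p, q, r_p, r_q) = gen_comm()
print(pow(g, 2 * r_p * r_q, N) == 1, 2 * r_p * r_q < S // 2)
```

In a real deployment the returned trapdoor `(p, q, r_p, r_q)` would have to be erased or never exposed, as discussed in Sect. 4.5; it is returned here only so that the order of `g` can be checked.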

4.5 Trusted Authority 

As can be seen in Sect. 4.4, the common parameter generation algorithm for
generating $g$ requires knowledge of the factors $(p, q)$ of $N$ and large
factors of $(p - 1)$ and $(q - 1)$. Thus the ‘Trusted Authority’ (TA) running
the generation algorithm must be trusted to not make use of this knowledge to
attack the system. But we emphasize that:

(1) Once $(g, N, S)$ have been generated, our scheme has no further need for a
TA (besides a public key certification authority, which is needed for any public
key scheme), and

(2) The TA need not manipulate any user secret keys, since users can generate
keys by themselves.

Hence, the TA for our scheme can cease to exist after $(g, N, S)$ have been
generated. The simplest form of TA is a sealed ‘black box’ device implementing
the algorithm of Sect. 4.4 and programmed to erase from memory all traces of
$(p, q)$ (and the factors of $p - 1$ and $q - 1$) after $g$ and $N$ have been
generated.
Alternatively, the TA can take the form of a group of users who engage in a 
‘private’ distributed common parameter generation algorithm, which prevents a 
minority of colluding dishonest participants from gaining knowledge on (p, q) . 
We have in mind an efficient protocol similar to that recently proposed by Bo- 
neh and Franklin [2], although the present case is more challenging due to the 
requirements on g. 

4.6 Trading Efficiency for Security 

We observe that by choosing $g$ of maximal order $\lambda(N)$, and
$S \gg \sqrt{N}$ (e.g. $|S| > 0.6|N|$), one can use Poupard and Stern’s
simulation proof technique [11] to prove the unforgeability of our scheme
relative to CDLP to base $g$, which is harder than the standard factorization
problem (we refer the reader to the full paper for more details). In this way,
one can trade off the efficiency of our scheme in return for a proof of
unforgeability relative to the standard factorization problem.
5 Security Analysis

5.1 Preliminaries 

We use the notation A(., .) to denote an algorithm, with input arguments
separated by commas (our underlying computational model is a Turing Machine). If
algorithm A makes calls to oracles, we list the oracles separated from the
algorithm inputs by the symbol ‘|’. Let $T(k)$ denote an upper bound on the
number of computation steps performed by algorithm A with input parameter $k$.
We say A has a polynomial time bound in $k$ if there exist $c \in \mathbb{R}$
and $k_0 \in \mathbb{N}$ such that $T(k) < k^c$ for all $k > k_0$. We say that a
function $f : \mathbb{N} \to \mathbb{R}$ is negligible if, for each
$c \in \mathbb{R}$, there exists a $k_0$ such that $f(k) < 1/k^c$ for all
$k > k_0$. We call a probability function $f : \mathbb{N} \to [0, 1]$
overwhelming if the function $g : \mathbb{N} \to [0, 1]$ defined by
$g(k) = 1 - f(k)$ is negligible.

5.2 Security Notions for Signcryption Schemes 

In this section, we adapt the ‘asymptotic’ digital signature security notions
defined in [7] to signcryption schemes. Our presentation style is similar to
that of Bellare (see, e.g. [1]), since we believe it makes all the security
proof assumptions explicit. In order to define precise notions of security for a
signcryption scheme, we first need a precise definition of a signcryption scheme
itself.

Definition 1. A Signcryption Scheme SCR = (GenComm, GenUser, S, U) is an
ordered sequence of four algorithms:

1. A probabilistic common parameter/oracle generation algorithm GenComm,
which takes as input a security parameter $k$ and returns a pair (CommPar, O),
where CommPar is a sequence of common parameters and
$O = (H_1, H_2, ..., H_{|O|})$ is a sequence of $|O|$ oracles
$O[i] = H_i : \{0,1\}^* \to \{0,1\}^*$.

2. A probabilistic user key-pair generation algorithm GenUser, which takes as
input a security parameter $k$ and a common parameter sequence CommPar and
returns a single user’s secret/public key-pair $(sk, pk)$.

3. A probabilistic Signcryption algorithm S, which takes as input a security
parameter $k$, a common parameters sequence CommPar, a sender’s secret key
$sk_A$, a recipient’s public key $pk_B$, and a message $m \in M$ ($M$ is the
message space), has access to oracles in a sequence O and returns a
signcryptext $C_S$.

4. An UnSigncryption algorithm U, which takes as input a security parameter
$k$, a common parameters sequence CommPar, a recipient’s secret key $sk_B$,
a sender’s public key $pk_A$, a signcryptext $C_S$, and has access to oracles in
a sequence O and returns a pair $(m, b)$, consisting of a message $m$ and a
verification bit $b$.

The translation of the informal definition of the scheme SCF in Section 3
is straightforward. We simply highlight that our proof of security applies to
the following version of our scheme: (1) The keyed-hash KH(.,.) is implemented
using concatenation and the one-way hash function $H_2(.)$. (2) The splitting
of $x$ into $x_1$ and $x_2$ is into the least significant and most significant
bits of $H_1(x)$, denoted $H_1^1(x)$ and $H_1^2(x)$ respectively. (3) The
algorithm GenComm outputs CommPar $= (g, N, S)$ and $O = (H_1, H_2, E, D)$.
(4) We shall assume the following property of GenComm: The outputs CommPar and
O are statistically independent (this includes as a special case the most
practical version where the hash and encryption algorithms in O are derived
deterministically from $k$). This assumption simplifies the intractability
assumption required in the security proof.

Apart from security considerations, a necessary condition for a signcryption 
scheme to be usable is that it must ‘work’ as expected in the absence of attackers. 
Such a scheme is called complete. 

Definition 2. Define the experiment

Experiment CompExp(SCR, k, m)

  $(CommPar, O) \leftarrow GenComm(k)$

  $(sk_A, pk_A) \leftarrow GenUser(k, CommPar)$

  $(sk_B, pk_B) \leftarrow GenUser(k, CommPar)$

  $C_S \leftarrow S(k, CommPar, sk_A, pk_B, m \,|\, O)$

  $(m', b) \leftarrow U(k, CommPar, sk_B, pk_A, C_S \,|\, O)$

  If $m' = m$ and $b = 1$ then Return 1 else Return 0

We say that signcryption scheme SCR is complete if, for all messages $m$,
Pr[CompExp(SCR, k, m) = 1] is an overwhelming probability function in $k$.

The completeness of our scheme SCF follows from a simple calculation, namely,
using the notation of Section 3 (with all arithmetic performed mod $N$), if
$(c', e', y') = (c, e, y)$ then:

$x' = (g^{y} v_A^{e})^{-s_B} = (g^{r + s_A \cdot e} g^{-s_A \cdot e})^{-s_B} = g^{r \cdot (-s_B)} = v_B^r = x$   (12)

and therefore $(x_1', x_2', m') = (x_1, x_2, m)$, so the message $m$ is recovered
and the verification test is passed (by convention, the unsigncryption algorithm
U returns a verification bit equal to 1 when the test is passed).

Authenticity security properties for signcryption schemes can be classified si- 
milarly to signature schemes, according to the resources available to the attacker 
and severity of the attacker’s output forgery (see [7] and [10]). We shall take the 
most conservative security notion — we will call a signcryption scheme unfor- 
geable only if it is infeasible for the most resourceful type of attacker (namely 
an ‘adaptively chosen message attacker’) to produce the weakest type of forgery 
(namely an ‘existential’ forgery). 

Definition 3. Let SCR = (GenComm, GenUser, S, U) be a signcryption scheme
using $|O|$ oracles. Let $R \subseteq \{1, ..., |O|\}$. Define the experiment

Experiment ForgeExp(k, SCR, A, R)

  $(CommPar, O) \leftarrow GenComm(k)$

  $(sk_A, pk_A) \leftarrow GenUser(k, CommPar)$

  $(sk_B, pk_B) \leftarrow GenUser(k, CommPar)$

  For each $i \in \{1, ..., |O|\}$
    If $i \in R$ Pick a random function $H_i' : \{0,1\}^* \to \{0,1\}^*$
    Else $H_i' \leftarrow O[i]$

  $O' \leftarrow (H_1', ..., H_{|O|}')$

  $C_S \leftarrow A(k, CommPar, pk_A, pk_B, sk_B \,|\, O', S(k, CommPar, sk_A, pk_B, \cdot \,|\, O'))$
  $(m, b) \leftarrow U(k, CommPar, sk_B, pk_A, C_S \,|\, O')$

  If $b = 1$ and $m \neq Q_{A-S}[i]$ for all $i$ then Return 1
  Else Return 0

We say that signcryption scheme SCR is existentially unforgeable under an
adaptively chosen message attack with respect to random oracle replacement set
$R$ if, for any algorithm A with polynomial time bound in $k$,
Pr[ForgeExp(k, SCR, A, R) = 1] is a negligible function in $k$.

In Definition 3, the set $R$ specifies the (indexes of) oracles in the sequence
O which are to be replaced by randomly chosen functions in the random oracle
model of the signcryption scheme SCR, to yield the oracle array O'. The attacker
A’s success probability in yielding a valid forgery is then assessed over all
random choices made in experiment ForgeExp including the choice of the random
functions for inclusion in O'. The notation $Q_{A-S}[i]$ refers to the $i$’th
query of A to S in the execution of A.

5.3 Security Properties of New Signcryption Scheme 

We first examine the authentication security of the signcryption scheme SCF. We
will prove (using the methods of Pointcheval [9]) the unforgeability of the
scheme SCF by showing (in Theorem 1) how to use a poly-time attacker for SCF,
which produces a valid forgery with non-negligible probability, to construct a
poly-time algorithm which factors the public modulus $N$ with non-negligible
probability. Then the unforgeability of SCF follows (in Corollary 1) from the
intractability assumption (see Definition 4) on factorization of $N$ given the
triple $(g, N, S)$ output by GenComm.

There are two key ideas in performing the reduction from factorization to
forgery of SCF in the proof of Theorem 1. The first is the use of the ‘forking’
technique [10] in extracting (from two executions of the attacker) two
signcryptexts $(c_1, e_1, y_1)$ and $(c_2, e_2, y_2)$ with the same ‘commitment’
$x^*$, i.e. satisfying
$(g^{y_1} v_A^{e_1})^{-s_B} = (g^{y_2} v_A^{e_2})^{-s_B}$. From these two
forgeries, the factoring algorithm can then compute a multiple of
$\mathrm{Ord}_{\mathbb{Z}_N^*}(g)$, namely
$L = s_B[(y_2 - y_1) - s_A(e_2 - e_1)]$. Once $L$ is known, it is then easy to
factor $N$ using the fact that $g$ is an asymmetric basis in $\mathbb{Z}_N^*$
(see Lemma 1 in the Appendix). Of course, $L$ cannot be used to factor $N$ if
$L$ is the trivial zero multiple of $\mathrm{Ord}_{\mathbb{Z}_N^*}(g)$. Here we
need the second key idea, namely the Witness Indistinguishability (WI) of the
signcryption algorithm S, to show that $L \neq 0$ with high probability.

The WI property, first studied by Feige and Shamir [4], can hold non-trivially
only for schemes in which the one-way function $f(.)$ transforming the secret
key space to the public key space is not one-to-one. In fact, an even stronger
requirement is needed, namely: For each public key $v$, the preimage
$W(v) = \{s : f(s) = v\}$ of $v$ under $f(.)$ contains at least $n \geq 2$
secret keys. In the case of our scheme, we have $f(s) = g^{-s} \bmod N$, which
can be made to map at least $n$ secret keys to each public key by choosing the
secret key space as $\{0, ..., S - 1\}$ with
$S > n \cdot \mathrm{Ord}_{\mathbb{Z}_N^*}(g)$. Under this assumption, the
statistical WI property of the signcryption algorithm S means that the
probability distribution of signcryptexts output by S is almost independent of
which secret key in $W(v_A)$ is used by the sender having public key $v_A$.

Referring back to the reduction above, note that the undesirable outcome
$L = 0$ results (apart from the negligibly probable case $s_B = 0$) from the
outcome $h(e_1, e_2, y_1, y_2) \stackrel{def}{=} (y_2 - y_1)/(e_2 - e_1) = s_A$.
It is straightforward to see that if the joint probability distribution of
signcryptexts observed by the attacker is perfectly WI, i.e. completely
independent of $s_A$ in $W(v_A)$, then choosing $s_A$ randomly in
$\{0, 1, ..., S - 1\}$ before running the attacker would give the undesirable
outcome $h(e_1, e_2, y_1, y_2) = s_A$ with probability not greater than
$\epsilon/n$, where $n$ is the number of secret keys in $W(v_A)$ and $\epsilon$
is the attacker’s success probability in computing the multiple $L$. However,
since the signcryption algorithm S is only statistically WI, we can only prove
that the distribution of the signcryptexts observed by the attacker is
approximately independent of $s_A$ in $W(v_A)$, and we need to bound the effect
of this on the probability of the outcome $L = 0$ to ensure that the factoring
algorithm still succeeds with non-negligible probability even when the attacker
queries S a polynomial number of times.

Theorem 1. Let A be an adaptively chosen message attacker algorithm for
signcryption scheme SCF = (GenComm, GenUser, S, U), having running time bound
$T$ and success probability

$\epsilon \stackrel{def}{=} \Pr[ForgeExp(k, SCF, A, \{1, 2\}) = 1]$,   (13)

where ForgeExp is the experiment in Definition 3. Suppose that $l_{A-S}$,
$l_{A-H_1}$ and $l_{A-H_2}$ are upper bounds on the number of oracle queries A
makes to S, $H_1$ and $H_2$, respectively. Let $S_L$ denote a lower bound on
$S$ output by GenComm. Define the experiment

Experiment FactorExp(k, GenComm, Fact)

  $(g, N, S, O) \leftarrow GenComm(k)$
  $p \leftarrow Fact(k, g, N, S)$

  If $p$ is a non-trivial divisor of $N$ then Return 1
  Else Return 0

There exists a factoring algorithm Fact(., ., ., .) with running time bound 
T' = 2T + lA-sO{k^) + OHIa-s + lA-H^f) 
and success probability 

$\epsilon' = \Pr[FactorExp(k, GenComm, Fact) = 1] \geq \epsilon^*(\epsilon^* - 8 l_{A-S}/2^{k'})/8 - 1/S_L$,



where 
Proof. Refer to the full paper. □

The following is a definition of the factorization intractability property we
would like to have for the common parameter generation algorithm GenComm of the
scheme SCF.

Definition 4. The common parameter generation algorithm GenComm of the
scheme SCF is said to be factorization-intractable if, for any polynomial time
factoring algorithm Fact(., ., ., .), the factoring success probability
Pr[FactorExp(k, GenComm, Fact) = 1] (where FactorExp is the experiment
defined in Theorem 1) is a negligible function in $k$.

Now we can state our unforgeability result for the scheme SCF as a corollary to
Theorem 1.

Corollary 1. For the scheme SCF = (GenComm, GenUser, S, U), if the common
parameter generation algorithm GenComm is factorization-intractable, then SCF
is existentially unforgeable under an adaptively chosen message attack with
respect to random oracle replacement set {1, 2}.

Proof. Refer to the full paper. □ 



6 Conclusions 

We presented a new signcryption scheme and proved its unforgeability in the
random oracle model with respect to a factorization problem. Due to lack of
space, all proofs have been omitted here and are included in the full paper
available from the authors. Remaining open problems are: (i) To prove the
confidentiality of our scheme, preferably relative to the factorization problem,
and (ii) To find an efficient distributed common parameter generation algorithm
for our scheme which leaks no knowledge about the factors of $N$ to a minority
of colluding participants. Finally, an interesting remaining challenge is to
find a scheme at least as efficient as the proposed one which also satisfies one
or both of: (i) The scheme is based on the standard RSA modulus factorization
problem and (ii) Each user has a personal modulus, i.e. the modulus to be
factored is not common to all users.

Acknowledgements. The authors would like to thank the anonymous referees 
7 Appendix

A central Lemma used to prove the unforgeability of our scheme is the following.

Lemma 1. There exists an algorithm which, given an RSA modulus $N$, an
asymmetric basis $g$ in $\mathbb{Z}_N^*$ and a non-zero multiple $L$ of
$odd(\mathrm{Ord}_{\mathbb{Z}_N^*}(g))$, outputs a non-trivial factor of $N$ in
time $O(|L| \cdot |N|^2)$.

Proof. Refer to the full paper. □ 
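
Since the proof is deferred to the full paper, the following Python sketch shows one natural way such an algorithm could work, in the spirit of Miller's factoring method; it is offered as an assumption about the approach, not as the authors' proof. It reads $odd(z)$ as the odd part of $z$, and the function name and the toy instance ($N = 11 \cdot 43$, $g = 113$) are chosen here for illustration. The idea is that $g^{L'}$, with $L'$ the odd part of $L$, has order a power of 2 modulo each prime, and because $g$ is an asymmetric basis these two powers differ, so repeated squaring reaches 1 modulo one prime before the other.

```python
from math import gcd

def factor_from_order_multiple(N, g, L):
    """Try to split N given a non-zero multiple L of the odd part of Ord(g).
    Returns a non-trivial factor of N, or None if the search fails."""
    L = abs(L)
    if L == 0:
        raise ValueError("L must be non-zero")
    while L % 2 == 0:                 # keep only the odd part of L
        L //= 2
    a = pow(g, L, N)                  # order of a is a power of 2 mod p and mod q
    for _ in range(N.bit_length()):
        d = gcd(a - 1, N)
        if 1 < d < N:                 # a = 1 modulo exactly one prime factor
            return d
        a = a * a % N                 # move to the next power of 2
    return None

# Toy instance: g = 113 has order 5 modulo 11 and order 14 modulo 43,
# so the multiplicities of 2 in the two orders (0 and 1) differ.
N, g = 11 * 43, 113
L = 35 * 12                           # a non-zero multiple of odd(70) = 35
print(factor_from_order_multiple(N, g, L))   # 11
```

The cost is dominated by the single exponentiation with an exponent of at most $|L|$ bits, consistent with the $O(|L| \cdot |N|^2)$ bound stated in the lemma under classical arithmetic.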